Preface


A common game in any school playground is to invent a special alphabet for send- 
ing and receiving secret messages. The effort devoted to these childhood codes has 
much more to do with the enthusiasm of the would-be spies than the threat of 
any third party snooping on the information being transmitted. In the adult world, 
however, such unwanted interest clearly exists, and the confidentiality of many 
communications is extraordinarily important. 

Codes and ciphers were once the preserve of a political and social elite, but the
arrival of the information age has made them essential to the smooth functioning of
society as a whole. This book attempts to explain the history of secret codes from
the point of view of the most qualified of guides: mathematics.

Cryptography, that is, the art of writing in code, appeared alongside writing
itself. Although the Egyptians and Mesopotamians made use of encryption methods,
the first to apply themselves to it fully were the early Greeks and the Romans,
aggressive cultures for whom communicating in secret was a key element of their
military success. Such secrecy brought about new kinds of adversaries: those who
declare themselves the keepers of the secret, the cryptographers, and those who hope
to reveal it, the cryptanalysts or code-breakers. This is always a battle carried out
behind the scenes, which over time may give the advantage temporarily to one
side or the other, but never reaches a decisive victory. In the 9th century, for
example, the Arab sage Al-Kindi came up with a deciphering tool known as
frequency analysis, which looked as though it could foil anyone writing in code.
The eventual response of encoders (it took centuries to appear) was the
polyalphabetic cipher. It, too, seemed to be a decisive weapon... until a more
sophisticated code-breaking system, devised by an English genius in the privacy of
his study, once again turned the advantage. Ever since, the principal weapon employed
by one side or the other has been mathematics, from statistics to modular arithmetic,
by way of number theory.

This encoding and deciphering battle reached a turning point with the
appearance of the first encryption machines, followed not long after by machines
devoted to decoding. The first programmable digital computer, named Colossus,
was invented and built by the British to crack coded messages from the Lorenz
machine, a German encoding device used alongside the better-known Enigma.

With the explosion of computing power, codes acquired a leading role in the
transmission of information beyond the traditional considerations of secrecy. The
universal language of modern society does not use letters or ideograms, but two
digits: 0 and 1. This is the binary code.

Which side benefited the most from the arrival of the new technologies, the 
cryptographers or the cryptanalysts? Is security still possible in this age of viruses, data 
theft and supercomputers? The answer to the second question is very much yes, and 
again we have to thank mathematics, in this case prime numbers and their particular 
characteristics. How long will this momentary dominance of the secret last? The 
answer to this question will take us to the furthest frontiers of contemporary science, 
to the theories of quantum mechanics, where astounding paradoxes will mark the 
end of this exciting journey through the mathematics of security and secrecy. 

This book ends with a bibliography for those who wish to go deeper into the 
world of encoding and cryptography, and the index will aid in the search. 


Chapter 1 
How Secure 
is Information? 


Cryptography: the art of writing or solving codes 
Oxford Dictionary 


The desire to create a message that can only be understood by the sender and 
its recipient — and is meaningless to any other person — is arguably as ancient as 
writing itself. In fact, there exists a series of “nonstandard” hieroglyphics 
that are more than 4,500 years old, although we do not know with any cer- 
tainty whether they represented an attempt to conceal information or 
were instead playing a part in some kind of ritual. We know more about a 
Babylonian tablet dated around 2,500 BC. It contains words with the first consonant
removed and employs some unusual variants of characters. Investigations have
revealed that the text describes a method of producing glazed ceramic, which leads 
us to conclude that it was engraved by a merchant or perhaps a potter who was 
concerned to protect trade secrets from competitors. 

With the spread of writing and trade came the birth of great empires, which 
in turn were engaged in frequent border disputes. Cryptography and the secure 
transmission of information became a priority for governments as well as merchants. 
Today, in the information age, the need to protect the integrity of communication
and to maintain an appropriate level of privacy is more important than ever. There 
is scarcely any flow of information that is not encoded in one way or another. The 
purpose of the code is to make it easier to send. For example, text is converted 
into the binary language (a numerical system using just 0 and 1) intelligible to a 
computer. Once encoded, most of this information can be protected from anyone 
that might intercept it. In other words, the code needs to be encrypted. Finally, the 
legitimate recipient has to be able to decipher the message. Encoding, encrypting 
and deciphering are the basic steps in the “dance of information” that is repeated 
millions of times per second of every minute, of every hour of every day. And the
music that accompanies this dance is none other than mathematics.


Codes, ciphers and keys


Cryptographers use the term encode in a slightly different sense from 
the rest of us. For them, encoding is a method of writing in code that consists 
of substituting one word for another. On the other hand, using encryption or a
cipher involves substituting letters or other single characters. With the passage of
time, the latter form has become so prevalent that it has become synonymous with 
“writing in code”. However, if we follow the more scholarly interpretation, the 
correct term for the second method would be to encrypt (or decrypt, in the case 
of the reverse process) a message. 

Let’s imagine we are sending a secure message “ATTACK”. We could do so in 
two basic ways: substituting the word (code), or substituting some or all of the letters 
that make up the word (cipher). A simple way to encode a word is to translate it into 
a language that the potential eavesdroppers won’t know, whereas with encryption 
it would be sufficient, for example, to substitute each letter with another located 
elsewhere in the alphabet. In each case, it is necessary that the receiver knows the
procedure that has been employed to encode or encrypt the message, or our message
will be useless. If we had already agreed with the recipient that we would use
one method or the other (translate it to another language or substitute each letter
with another), all we would need to do would be to inform him or her of the
targeted language or the number of places we have moved forward in the alphabet
to substitute each letter. In an encrypted example, if the recipient gets the message
“CVVCEM” and knows that we have moved each letter forward two places, he or she can
easily reverse the process and decrypt the message.
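The “ATTACK” example can be tried out in a few lines of Python. This is only a minimal sketch: the function name `shift_cipher` and the restriction to the 26 unaccented capital letters are assumptions made for the illustration, not something the text prescribes.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed alphabet: the 26 letters A-Z


def shift_cipher(text: str, key: int) -> str:
    """Move each letter `key` places along the alphabet, wrapping after Z."""
    result = []
    for ch in text.upper():
        if ch in ALPHABET:
            result.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            result.append(ch)  # leave spaces and punctuation untouched
    return "".join(result)


ciphertext = shift_cipher("ATTACK", 2)    # sender moves letters forward two places
plaintext = shift_cipher(ciphertext, -2)  # recipient moves them back two places
print(ciphertext, plaintext)
```

The recipient needs only two pieces of information: the rule (shift the letters) and the number of places (here, 2).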


THE BINARY CODE


For a computer to understand and process information, it must be
translated from the language in which it is written into the so-called
binary language. This language consists of two digits only: 0 and 1. The
binary expressions for the decimal numbers 0 to 10 are shown in the table
below.

Decimal   Binary
0         0
1         1
2         10
3         11
4         100
5         101
6         110
7         111
8         1000
9         1001
10        1010

Consequently, the decimal number 9,780 would be expressed, in binary
code, as 10011000110100.
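The conversion from decimal to binary can be sketched as repeated division by two, collecting the remainders. The helper name `to_binary` is an assumption for this illustration; Python's built-in `bin` would do the same job.

```python
def to_binary(n: int) -> str:
    """Return the binary digits of a non-negative integer as a string."""
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next digit, right to left
        n //= 2
    return "".join(reversed(digits))


# Binary expressions for the decimal numbers 0 to 10
for value in range(11):
    print(value, to_binary(value))

print(to_binary(9780))
```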
TO TRANSLATE OR TO DECRYPT?


Translations of text written in a language using an unknown character set can be
approached as a general problem of decryption. The translation can be seen as the
unknown text already translated into our language, and the encrypting algorithm
would be the grammatical rules and syntax of the original language. The techniques
used for both tasks, translating and decrypting, have many similarities. In both
cases the same condition needs to be met: the sender and the receiver must, at the
least, share a common language. That is why the translation of texts written in lost
languages, such as the Egyptian hieroglyphic or Linear B, was impossible until a way
of relating them to a known language was found. In both cases, this was Ancient
Greek. The picture above is of a tablet found in Crete written in Linear B.


The distinction we have established between the encryption rule (the system 
being applied) and the parameter of encryption (a variable instruction that is spe- 
cific to each message or a group of messages) is extremely useful because a potential 
spy would need to know both to decipher the message. Thus the spy could know 
that the key to the cipher is to substitute each letter with the corresponding letter 
a specific number, x, places further forward in the alphabet. However, if he does 
not know what x is, he will have to try all possible combinations: one for each 
letter of the alphabet. In this example, the cipher is very simple and to exhaust all
the possibilities (what is known as brute-force decryption) is not particularly
laborious. However, in the case of more complex techniques, this type of
code-breaking, or cryptanalysis, is practically impossible, by hand at any rate.
Moreover, the interception and deciphering of messages are both generally subject to
important time restrictions. The information has to be obtained and understood before
it becomes useless or widely known by others.
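A brute-force attack on a letter-shifting cipher can be sketched as follows: the spy knows the rule but not x, so every value of x is tried in turn. The function names and the sample ciphertext “CVVCEM” are assumptions made for the illustration.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed alphabet: the 26 letters A-Z


def shift(text: str, places: int) -> str:
    """Move each letter `places` positions along the alphabet, wrapping round."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + places) % 26] if ch in ALPHABET else ch
        for ch in text.upper()
    )


def brute_force(ciphertext: str) -> dict[int, str]:
    """Try every possible x and return the candidate plaintext for each one."""
    return {x: shift(ciphertext, -x) for x in range(26)}


candidates = brute_force("CVVCEM")
for x, guess in candidates.items():
    print(x, guess)
```

Only one of the 26 candidates reads as a sensible word, which is why so small a key space offers no real protection.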
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What is the. minimum + number of — needed | in a 1 system with two. ) users? Three? Four? For : 
two users to communicate with each other secretly, only one code or key i is necessary. in the 
case of three users, three are needed: one for the communication between A and B, another 

i for the pair A and G, and a third for B and C. Similarly, four users would require six keys, Thus 

: to generalise, for n users we would need. as many keys as there are combinations of pairs of n 


: users, that i is: — oo 


ss So. a telatively small system of 10, 000 interconnected users would require 49, 995, 000 
' keys. In the case of a world population of six billion individuals, the number is sizing: 
| 17,999,999, 997, 000, 0009, 000. 2. : | | 
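The key counts quoted here follow from counting the pairs of users, n(n - 1)/2. A short check, with `shared_keys_needed` as a purely illustrative name:

```python
from math import comb


def shared_keys_needed(users: int) -> int:
    """Number of distinct pairs among `users` people, one shared key per pair."""
    return users * (users - 1) // 2


for users in (2, 3, 4, 10_000, 6_000_000_000):
    print(f"{users:>13,} users -> {shared_keys_needed(users):,} keys")

# The same quantity is the binomial coefficient "n choose 2".
assert shared_keys_needed(10_000) == comb(10_000, 2)
```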


The general rule of encryption is often termed the encryption algorithm, while
the specific parameter used to cipher or encode the message is termed the key. (In
the “ATTACK” ciphering example above, the key is 2: each original letter
is replaced by another two places further on in the alphabet.) Obviously a great
number of keys are possible for every encryption algorithm, and so knowing the
algorithm alone can be as good as useless unless we also know the key used to encrypt
the message. Since the keys are generally easier to change and to disseminate, it seems
logical to concentrate on keeping the keys most secret in order to maintain the security of
an encryption system. This principle was established at the end of the 19th century
by the Dutch linguist Auguste Kerckhoffs von Nieuwenhof, and is thus known as
Kerckhoffs’ principle.




To summarise what we have presented to this point, we can set out a general
system of encryption defined by the following elements:


[Diagram: the sender enciphers the message with algorithm + key; the recipient deciphers it with algorithm + key.]


That is, a sender and a recipient of the message, an encryption algorithm and a
defined key that allows the sender to cipher the message and the receiver to decipher
it. Later, we will see how this diagram has been modified in recent times because
of the changing nature and function of keys, but for the time being we will stick
to this diagram.


Private keys and Public keys 


Kerckhoffs’ principle establishes the key as the fundamental element in the security
of any cryptographic system. Until relatively recently, the keys of a sender and a
receiver in all conceivable cryptographic systems needed to be identical or at least
symmetrical, that is, they needed to be used for both the encryption and decryption
of a message. The key was, therefore, a secret shared by the sender and the
recipient, and thus the cryptographic system in use was always vulnerable, so to
speak, from both sides. This type of cryptography, which is dependent on a key
shared by the sender and the receiver, is known as private-key cryptography.

All cryptographic systems invented by humans since the beginning of time,
irrespective of the algorithm used and its complexity, shared this characteristic.


HOW MANY KEYS ARE REQUIRED? PART 2


As we have seen, classical cryptography required an enormous number of keys.
However, in a public cryptographic system any two users who exchange messages only need
four of them: their respective public and private keys. In this case n users require 2n keys.
Making the key the same for the recipient and the sender seems to be common
sense. After all, how can one person encode a message according to one
code, and a second decipher it according to another and hope that the meaning
of the text is retained? For thousands of years this possibility was considered a
logical absurdity. However, as we shall see in more detail later, just five decades
ago the absurd became entirely possible, and is now a ubiquitous part of codes.

Nowadays, encryption algorithms used in the majority of communications
involve at least two keys: a secret, private one, as was already customary, and
a public one known by everyone. The transmission mechanism is as follows:
the sender gets the public key of the recipient to whom he wishes to send the
message and uses it to encrypt the message. The receiver takes his private key
and uses it to decipher the received message. Moreover, this system possesses an
extremely important additional advantage: the sender and the recipient do not
need to have met in advance to agree on any of the keys involved, so
the security of the system is very much tighter than was possible before. This
completely revolutionary form of encryption is known as public-key cryptography,
and forms the basis of the security underlying today’s communication networks.

Mathematics is at the root of this revolutionary technology. In effect, as we
shall explain in detail later on, modern cryptography sits on two foundations.
The first is modular arithmetic, while the second is number theory, particularly
the part concerning the study of prime numbers.
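To give a flavour of how a public key can encrypt what only a private key can decrypt, here is a toy sketch. The text does not name a specific algorithm at this point; the example assumes textbook RSA with deliberately tiny primes, and every number in it (p, q, e and the message 65) is chosen purely for illustration. It is in no way secure.

```python
# Toy public-key example: textbook RSA with tiny primes (illustration only).
p, q = 61, 53            # two small primes, known only to the recipient
n = p * q                # modulus, shared by both keys
phi = (p - 1) * (q - 1)  # used only to derive the private exponent
e = 17                   # public exponent, chosen to have no common factor with phi
d = pow(e, -1, phi)      # private exponent: inverse of e modulo phi (Python 3.8+)

public_key = (e, n)      # published for everyone
private_key = (d, n)     # kept secret by the recipient


def apply_key(number: int, key: tuple[int, int]) -> int:
    """Raise `number` to the key's exponent using modular arithmetic."""
    exponent, modulus = key
    return pow(number, exponent, modulus)


message = 65                                    # a message encoded as a number below n
ciphertext = apply_key(message, public_key)     # the sender uses the public key
recovered = apply_key(ciphertext, private_key)  # the recipient uses the private key
print(ciphertext, recovered)
```

Both ingredients named above appear here: the keys are built from prime numbers, and encryption and decryption are exponentiations in modular arithmetic.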


The Zimmermann telegram 


Cryptography is one of the areas of applied mathematics in which the contrast
between the pristine clarity of the underlying theory and the murky consequences
of its implementation is most apparent. After all, the destiny of entire nations
depends on the success or the failure of maintaining secure communications. One
of the most spectacular examples of how cryptography changed the course of
history occurred almost a century ago, in what became known as the Zimmermann
telegram affair.

On May 7, 1915, with half of Europe engaged in bloody conflict, a German U-boat
torpedoed the transatlantic passenger liner Lusitania, which was sailing under
the British flag near the coast of Ireland. The result was one of history’s most
infamous massacres: 1,198 civilians, 124 of whom were American, lost their lives. The
news enraged public opinion in the United States, and the government of President
Woodrow Wilson warned his German counterparts that any similar act would lead
to the immediate entry of the United States into the war on the Allied side. In
addition, Wilson demanded that German submarines surface before carrying out any
attack so as to avoid the sinking of further civilian ships. The tactical advantage of
the U-boat force was therefore seriously compromised.

[Figure: How The New York Times reported the sinking of the Lusitania.]

In November, 1916, Germany appointed Arthur Zimmermann, a man with a 
reputation for diplomacy, as its new foreign minister. The news was welcomed by 
the United States press, who saw his appointment as a good omen for relations 
between Germany and the USA.

In January, 1917, less than two years after the tragedy of the Lusitania, and with the
conflict at its peak, the German ambassador to Washington, Johann von Bernstorff,
received the following coded telegram from Zimmermann, with instructions to
deliver it in secret to his counterpart in Mexico, Heinrich von Eckardt:


“We intend to begin on the first of February unrestricted submarine
warfare. We shall endeavour in spite of this to keep the United States of
America neutral.

In the event of this not succeeding, we make Mexico a proposal of alliance
on the following basis: make war together, make peace together, generous
financial support and an understanding on our part that Mexico is to reconquer
the lost territory in Texas, New Mexico, and Arizona. The settlement
in detail is left to you [von Eckardt].

You will inform the President [of Mexico] of the above most secretly as
soon as the outbreak of war with the United States of America is certain
and add the suggestion that he should, on his own initiative, invite Japan to
immediate adherence and at the same time mediate between Japan and
ourselves.

Please call the President’s attention to the fact that the ruthless employment
of our submarines now offers the prospect of compelling England in a few
months to make peace.”


If it had been made public, the certain consequence of this telegram would 
have been the outbreak of war between Germany and the United States. Although 
Kaiser Wilhelm II knew this would be inevitable once submarines operated without 
surfacing before an attack, he hoped that by then the United Kingdom would have 
capitulated and therefore there would be no conflict for the United States to join. 
Barring this circumstance, the active threat of Mexico along the southern border 
of the United States could equally dissuade that country from entering another 
conflict many miles away. Mexico, however, was going to need a certain amount of 
time to organise its forces. Therefore it was vital that Germany’s intentions remain 
secret long enough for the submarine warfare to tip the balance of the conflict in
Germany’s favour.


Room 40 gets to work 


The British government, however, had other plans. Shortly after the start of the 
war, they had cut the undersea telegraphic cables that connected Germany directly 
with the Western Hemisphere, so any electronic communications had to go via 
cables that the British could intercept. The United States, in an attempt to bring 
about a negotiated end to the conflict, had been allowing Germany to continue trans- 
mitting diplomatic messages. As a result, Zimmermann’s message was received intact by 
the German delegation in Washington DC. 

[Figure: Zimmermann’s telegram (top) forwarded by the German ambassador in Washington DC,
Johann von Bernstorff, to his counterpart in Mexico, Heinrich von Eckardt, with the
deciphered version of the same telegram below it.]

[Figure: Part of the British decoding of Zimmermann’s telegram. In the lower part
can be seen how the Germans, lacking a code for the word “Arizona”,
encoded it in sections: AR, IZ, ON, A.]


The British government sent the intercepted message to its code-breaking 
department, known as Room 40. 

The Germans had used their normal foreign ministry encryption algorithm 
and had used a cipher known as 0075, which the experts of Room 40 had already 
partially broken. The algorithm in question involved the substitution of words 
(encoding) as well as letters (ciphering), a practice similar to that used in another 
of the encrypting tools used at that time by the Germans, the cipher ADFGVX, 
which we will examine in more detail later.

The British did not take long to decipher the telegram, although they were
reluctant to show it to the Americans right away. There were two reasons for this. First,
the secret telegram had been transmitted under the diplomatic cover provided by
the United States to German messages, a privilege that the British had ignored.
Second, if the telegram was made public, the German government would immediately
know that its codes had been compromised and would change its system of
encryption. Therefore, the British decided to tell the Americans that the intercepted and
decrypted version was the one forwarded by von Bernstorff to von Eckardt in Mexico, and so
convince the Germans that the telegram had been intercepted, already decrypted, in Mexico.

At the end of February, Wilson’s government leaked the contents of the telegram 
to the press. Some members of the press — particularly the newspapers belonging 
to the Hearst group, which was anti-war and pro-German — were sceptical at first. 
However, by mid-March, Zimmermann publicly admitted to being the author of 
the controversial message. A little over two weeks later, on April 6, 1917, the US 
Congress declared war on Germany, a decision that would have far-reaching con- 
sequences for Europe and the world. 

Although extraordinary in its time, Zimmermann’s telegram is just one of the 
historical landmarks in which cryptography has played an essential role. Throughout 
this book we will see many other examples, scattered throughout the centuries and 
from all cultures. Even so, we can be almost certain that we do not know about 
many of the most crucial events. By its very nature, the history of cryptography is
a secret history.
Cryptography from Antiquity
to the 19th Century


As we have already noted, cryptography is an ancient discipline, probably as ancient 
as written communication itself. However, it is not the only possible method for 
transmitting information in secret. After all, every text has to have a medium, and 
if we make the medium invisible to everyone except the recipient, we will have 
accomplished our objective. The technique of concealing the existence of the message
itself is called steganography, and it probably originated around the same time
and for the same reasons as cryptography.


Steganography 


The Greek scholar Herodotus, considered one of the world’s greatest historians,
mentions in his famous chronicle of the war between the Greeks and the Persians
in the 5th century BC two curious instances of steganography that reveal a
considerable amount of ingenuity. In the first example, contained in Book V of
Herodotus’s History, Histiaeus, the tyrant of Miletus, commanded a man to shave his head.
He then wrote the message that he wanted to send on the man’s scalp and waited
for his hair to grow back. The man was then sent to his destination, Aristagoras’
camp. Safely there, the messenger explained the ploy to Aristagoras and shaved his
hair off again, revealing the long-awaited message. The second example, if true, is of
greater historical importance because it allowed Demaratus, a Spartan king exiled
in Persia, to warn his compatriots of an imminent invasion by the Persian king,
Xerxes. Herodotus takes up the story in Book VII:


“The fact was that Demaratus could not warn them just like that, so he had
the following idea: he took a pair of [writing] tablets, scraped off the wax and
wrote the king’s plans on the wooden surface of the tablets. He then covered
them with melted wax, thus concealing the message. In this way the tablets,
being apparently blank, would cause no trouble with the guards stationed
along the road.

When the tablets finally reached Lacedaemon (Sparta), the Lacedaemonians
couldn’t figure out the secret until, as I understand it, Gorgo [...] suggested
that they scrape the wax off the tablets because, she indicated, they would
find a message engraved on the wood beneath.”


A steganographic device that has stood the test of time is invisible ink. Celebrated
in thousands of stories and films, the materials used (lemon juice, plant sap, and
even human urine) are generally of organic origin and have a high carbon content.
Therefore, they tend to darken when exposed to moderately high temperatures, such
as the heat from a candle flame.

Steganography’s usefulness is beyond dispute, although it is utterly unfeasible 
when dealing with large numbers of communications. Moreover, used on its own 
the technique has a significant flaw: if the message were to be intercepted, the con- 
tents would be immediately apparent. For this reason steganography is principally 
employed as a complement to cryptography, a means of strengthening the security 
of top secret transmissions. 

We can deduce from the examples given that armed conflict has been a great 
driver for the secure transmission of information. This being so, it is not surprising 
that a martial people such as the Spartans (if we believe Herodotus, already masters
of steganography) would also be pioneers in the development of cryptography.


Transposition cryptography 


In the conflict between the Spartans and the Athenians for control of the Peloponnese,
frequent use was made of long strips of paper wrapped around a cylinder, known as a scytale.
A message was then written on the coiled paper. Even if the technique used (that is, the
encryption algorithm) was known by the enemy, if the exact dimensions of the
scytale were not known, anyone intercepting the message would find it extremely
difficult to decipher its meaning. The thickness and length of the scytale were, in
essence, the key to the encryption system. When the paper strip was unwound, the
message became illegible.
WITH TINY LETTERS


During the years of the Cold War, dramatic spy thrillers frequently portrayed the protagonists
sending detailed messages by way of a medium that was too small to read with the naked eye:
microfilm. The technique was born several years before, during World War II, when German
agents used a steganographic technique known as microdot. This consisted of a photograph
of a brief text reduced to the size of a full stop, which was then included as just one of many
typographical symbols within an innocuous text.


In the illustration below, the message (M) to be transmitted is “A message encoded
with a scytale”, but the unwound strip of paper displays the incomprehensible
gibberish (C): “anh mca eos sdc sey adt gwa eil ete”.


M = A MESSAGE ENCODED WITH A SCYTALE 
C = ANH MCA EOS SDC SEY ADT GWA EIL ETE 
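The scytale example can be reproduced with a short sketch. It assumes the message is written along the cylinder in three lines of nine letters each (an arrangement inferred from the ciphertext, not stated in the text), so that the unwound strip reads the letters column by column. The function names and the padding letter X are illustrative choices.

```python
import math


def scytale_encrypt(message: str, rows: int) -> str:
    """Write the letters in `rows` lines along the cylinder, then read the columns."""
    letters = [ch for ch in message.upper() if ch.isalpha()]
    cols = math.ceil(len(letters) / rows)
    letters += ["X"] * (rows * cols - len(letters))  # pad the last line if needed
    return "".join(letters[r * cols + c] for c in range(cols) for r in range(rows))


def scytale_decrypt(ciphertext: str, rows: int) -> str:
    """Rewind the strip: rebuild each line by taking every `rows`-th letter."""
    letters = [ch for ch in ciphertext.upper() if ch.isalpha()]
    cols = len(letters) // rows
    return "".join(letters[c * rows + r] for r in range(rows) for c in range(cols))


cipher = scytale_encrypt("A MESSAGE ENCODED WITH A SCYTALE", rows=3)
plain = scytale_decrypt(cipher, rows=3)
print(cipher)
print(plain)
```

Here the number of lines plays the part of the key: a recipient who winds the strip on a cylinder of different dimensions reads only gibberish.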


A MANUAL FOR YOUNG LADIES


The Kama Sutra is a lengthy manual that deals, among other things, with the knowledge that
a woman needs in order to be a good wife. Written around the 4th century AD by the
Brahmin Vatsyayana, it recommends up to 64 different skills, including music, cooking and chess.
Number 45 is of particular interest to us, because it deals with the art of secret writing, or
mlecchita-vikalpa. The learned author recommends several methods, including the following:
divide the alphabet in half and pair the resulting letters at random. In this system, each pairing
of letters represents a key. For example, one of them could be the following:

[Table: an example pairing of the letters of the alphabet.]

To write the secret message one would just have to substitute every A in the original text with
E, P with C, J with W, etc., and vice versa.


Using a scytale employs a cryptographic technique known as transposition, where
the letters in the message are reordered. To get an idea of the power of this method,
consider the simple example of transposing just three letters: A, O and R. A quick
test with no calculations necessary reveals that they can be reordered in up to six
different ways: AOR, ARO, OAR, ORA, ROA and RAO.

In abstract terms, the process is as follows: once one of the three possible
letters is placed first, allowing for three different arrangements, we are
left with two letters that can in turn be reordered in two different ways for
a new total of 3 × 2 = 6 arrangements. In the case of a somewhat longer message
of, for example, 10 letters, the number of possible arrangements is now
10 × 9 × 8 × 7 × 6 × 5 × 4 × 3 × 2 × 1. Such an operation is expressed by the
mathematical notation 10! and produces a total of 3,628,800. In general terms, for n
letters, there are n! different ways to reorder them. So, a message of a modest 40
letters would produce so many ways to reorder the letters that it would be practically
impossible to decipher by hand. Have we perhaps found the perfect cryptographic
method?

Not entirely. In effect, a random algorithm of transposition offers a higher level
of security, but what is the key that allows it to be deciphered? The randomness of
the process is both its strength and its weakness. Another encryption method was
needed that would generate keys that were simple and easy to remember and transmit,
without sacrificing large amounts of security. So began the search for the perfect
algorithm, and the first successes were achieved by the Roman emperors.
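To make the factorial counts in this section concrete, the following sketch lists every reordering of three letters and then computes n! for longer messages. The helper name `arrangements` is an assumption for the illustration; the counting itself relies on Python's standard `itertools.permutations` and `math.factorial`.

```python
from itertools import permutations
from math import factorial


def arrangements(letters: str) -> list[str]:
    """List every possible reordering of the given (distinct) letters."""
    return ["".join(p) for p in permutations(letters)]


print(arrangements("AOR"))  # the six orderings of A, O and R
print(factorial(10))        # reorderings of a 10-letter message
print(factorial(40))        # reorderings of a 40-letter message
```

Listing the arrangements is feasible only for very short messages; for 40 letters the count alone runs to 48 digits.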


To Caesar what is Caesar’s 


Veni, vidi, vici (I came, I saw, I conquered). 


Julius Caesar 


Substitution ciphers developed in parallel with transposition ciphers. Unlike
transposition, strict substitution exchanges one letter for another, or for any type
of symbol, so it does not draw on just the letters that appear in
the message. In transposition, the letter changes its position, but maintains its role;
the same letter has the same meaning in the original message and in the ciphered
message. In substitution, the letter maintains its position but changes its role (the
same letter or symbol has one meaning in the original message and another in the
ciphered message). One of the first known substitution algorithms is the so-called
Polybius cipher, in honour of the Greek historian Polybius (203-120 BC), who left
us a description of it. His method is developed in full in the Appendix.

Approximately 50 years after the Polybius cipher, in the first century BC,
another substitution cipher appeared, known by the generic name of Caesar’s cipher
because Julius Caesar was one of its most famous practitioners. Caesar’s cipher
is one of the best studied in the field of cryptography and it is extremely useful
because it illustrates the principles of modular arithmetic, one of the mathematical
foundations of writing in code.

GAIUS JULIUS CAESAR (100-44 BC)


Caesar (right) was a soldier and statesman whose dictatorship would end the Roman
Republic. After serving as magistrate in Hispania Ulterior, he joined two
other powerful people of the period, Pompey and Crassus, and with them formed the
First Triumvirate, validated by the marriage of Julia, Caesar’s daughter,
to Pompey. The three divided up the Roman empire: Crassus got command of the eastern
provinces, Pompey remained in Rome, and Caesar assumed the military command of
Cisalpine Gaul and the Proconsulship of Narbonese Gaul. At this time, the war against
the Gauls began. It lasted eight years and culminated in the Romans conquering Gaulish
territory. From there, Caesar marched back to the imperial capital with his victorious
legions and installed himself as sole dictator.


Caesar’s cipher operates by replacing each letter of the alphabet with another
one a fixed number of positions down the alphabet. According to the
great historian Suetonius in his The Twelve Caesars, Julius Caesar coded his
personal correspondence with a substitution algorithm of this type: each letter of
the original message was substituted by another that followed three positions further
down the alphabet: the letter A was substituted by D, B by E, and so on. W became
Z, and so X, Y and Z reverted to A, B and C.


The encoding and decoding of a message encrypted in this way could be carried
out with a simple device like the one below:


Now we will examine the process in greater detail. In the table below, we see the
starting alphabet and the transformation caused by Caesar’s cipher of substituting
the letter three positions further down the alphabet of 26 letters (the upper row
shows the original alphabet and the lower row shows the ciphered alphabet).


A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
D E F G H I J K L M N O P Q R S T U V W X Y Z A B C


FILM CODES

In the classic science-fiction film 2001: A Space Odyssey (1968), directed by Stanley
Kubrick and based on a story by Arthur C. Clarke, a spacecraft’s supercomputer, called
HAL 9000, is endowed with consciousness and becomes insane, attempting to kill the
human crew. Now take Caesar’s cipher with a key of B and encrypt the word “HAL” with
that code. We see that the letter H corresponds to the letter I, the A to the letter B,
and the L to the letter M. In other words, the result is “IBM”, at the time the largest
computer manufacturer in the world. Was the film making a comment about the dangers of
artificial intelligence or the pitfalls of unregulated commercial power? Or was it just
a coincidence?

The all-seeing eye of HAL 9000 from the film 2001: A Space Odyssey.


When the two alphabets, the original (or plaintext) and the ciphered, are arranged
in this way, encrypting any message is simply a question of substituting the letters
of one with those of the other. The key to the cipher is named after the letter that
corresponds to the encrypted value for A (the first letter of the original alphabet). In
this case, it is the letter D. The classic expression “AVE CAESAR” (“Hail Caesar”)
would be encrypted as “DYH FDHVDU.” Conversely, if the ciphered message is
“WUHH”, then the decrypted or plaintext message is “TREE.” In the case of the
Caesar code just described, a cryptanalyst who had intercepted the message and
knew the algorithm being used, but not the key, would have to try all possible
displacements until he found a message that made sense. To do this he would have to
explore, at the most, the total number of keys, or displacements. With an alphabet
of n letters there are n possible displacements, and therefore n possible codes.
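The exhaustive search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `caesar_shift` and `brute_force` are invented for the example, and the alphabet is assumed to be the 26 capital letters A to Z.

```python
import string

ALPHABET = string.ascii_uppercase  # assumed alphabet: the 26 capital letters A-Z


def caesar_shift(text: str, k: int) -> str:
    """Move each letter k places along the alphabet, wrapping around at the end."""
    n = len(ALPHABET)
    shifted = []
    for ch in text:
        if ch in ALPHABET:
            shifted.append(ALPHABET[(ALPHABET.index(ch) + k) % n])
        else:
            shifted.append(ch)  # spaces and punctuation are left untouched
    return "".join(shifted)


def brute_force(ciphertext: str) -> list:
    """Undo every possible displacement and return (displacement, candidate) pairs."""
    return [(k, caesar_shift(ciphertext, -k)) for k in range(len(ALPHABET))]


for k, candidate in brute_force("DYH FDHVDU"):
    print(k, candidate)
```

The loop prints one candidate per displacement, so for an alphabet of 26 letters there are only 26 lines to read. Picking out the line that makes sense is still left to the reader.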


16 ≡ 4. Modular arithmetic and the mathematics
of Caesar’s cipher


16 ≡ 4? And 2 ≡ 14? This is not a mistake, nor is it some strange numbering
system. The operation of a Caesar cipher can be formulated with a tool that is very
common in mathematics and even more so in cryptography: modular arithmetic,
sometimes called clock arithmetic. This technique had its origins in the work of
the Greek mathematician Euclid (325-265 BC), and it is one of the fundamentals
of modern information security. In this section, we will introduce the basic
mathematical concepts related to this particular type of arithmetic.


THE FATHER OF ANALYTIC CRYPTOGRAPHY

The main work of Euclid of Alexandria, the Elements, consists of 13 volumes that deal
with subjects such as plane geometry, proportions, the properties of numbers, irrational
numbers, and the geometry of space. Although he is mostly associated with this last
field, the works of the Greek mathematician relating to arithmetic operations on finite
sets of numbers, or modules, constitute one of the pillars of the formal study of modern
cryptography. Although his works were known and admired by Arab scholars, the first
modern European edition appeared in Venice in 1482. It may not be coincidental that
both the Arabs and the Venetians were great masters of cryptography.


Take a classic analogue clock as an example and compare it with a digital one.
The analogue distribution of the hours divides the circle into 12 parts that we will
write as 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11. The equivalent numbering of pm hours
between an analogue clock and a digital one can be seen in the following table.


Analogue    0   1   2   3   4   5   6   7   8   9  10  11
Digital    12  13  14  15  16  17  18  19  20  21  22  23


When, for example, we say that it is “14:00” we are also saying that it is two
o’clock in the afternoon. The same principle applies in the case of the measurement of
angles. A 370° angle is equivalent to a 10° angle because you have to deduct a
complete 360° turn from the first value. Note that 370 = (1 × 360) + 10 and also that
10 is the remainder when 370 is divided by 360. What angle is equivalent to 750°?
Deducting the relevant complete turns we find that a 750° angle is equivalent to a 30°
angle. We conclude that 750 = (2 × 360) + 30 and that 30 is the remainder of dividing
750 by 360. The mathematical notation for this is:


750 ≡ 30 (mod. 360).


And we say that “750 is congruent with 30 modulus 360.” In the case of the clock,
we would write 14 ≡ 2 (mod. 12).

We could also imagine a clock with negative numbers. In this case, what time
would it be when the hand of the clock points to -7? Or, in other words, what
would -7 be congruent with in modulus 12? Let us calculate this remembering
that the value “0” in our 12-part clock is equivalent to “12”:


-7 ≡ -7 + 0 ≡ -7 + 12 ≡ 5 (mod. 12).


How to calculate 231 in modulus 17 with a calculator?
First we divide 231 by 17 and we get 13.58823529.
Then we multiply the whole part of the quotient by the divisor, 13 × 17 = 221. In this
way we do away with the decimals altogether.
Finally we do the subtraction 231 - 221 = 10, thus obtaining the remainder of the division.
231 in modulus 17 is 10. This datum is written as 231 ≡ 10 (mod. 17).
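The three calculator steps can be mirrored directly in code. The sketch below is an illustration only: the function name `remainder_by_calculator` is invented here, and the method is assumed to be applied to a non-negative number and a positive modulus, as in the example.

```python
def remainder_by_calculator(a: int, m: int) -> int:
    """Reproduce the three calculator steps for a non-negative a and a positive m."""
    quotient = a / m              # step 1: 231 / 17 = 13.588...
    whole_part = int(quotient)    # discard the decimals, leaving 13
    return a - whole_part * m     # steps 2 and 3: 231 - 13 * 17


print(remainder_by_calculator(231, 17))
```

In practice Python's `%` operator gives the same remainder in one step (`231 % 17`); the longer form is shown only because it follows the calculator procedure line by line.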
The mathematics of the calculations with our analogue 12-part clock is called
arithmetic in modulus 12. In general terms, we can say that a ≡ b (mod. m) if a and b
leave the same remainder when divided by m, given that a, b and m are whole
numbers. In particular, when b lies between 0 and m - 1, the number b is the remainder
of dividing a by m. The following statements are equivalent:


a ≡ b (mod. m)
b ≡ a (mod. m)
a - b ≡ 0 (mod. m)
a - b is a multiple of m


The question “What analogue time is 19 hours?” is equivalent in mathematical
terms to the following question: “What is 19 congruent with in modulus 12?” To
answer this question we have to solve the equation


19 ≡ x (mod. 12).


Dividing 19 by 12 we get the quotient 1 and the remainder 7, so
19 ≡ 7 (mod. 12).


And in the case of 127 hours? We divide 127 by 12 and we get the quotient 10
and the remainder 7, therefore
127 ≡ 7 (mod. 12).
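The clock questions and the equivalent statements about congruence can be checked with a short sketch. The helper names `is_congruent` and `analogue_hour` are assumptions made for this illustration.

```python
def is_congruent(a: int, b: int, m: int) -> bool:
    """True when a - b is a multiple of m, that is, when a is congruent with b in modulus m."""
    return (a - b) % m == 0


def analogue_hour(hours: int) -> int:
    """Position on the 12-part clock (written 0 to 11) for any number of hours."""
    return hours % 12


print(analogue_hour(19), analogue_hour(127))
print(is_congruent(19, 7, 12), is_congruent(7, 19, 12))
```

Testing `a - b` for divisibility, rather than comparing `a % m` with `b`, keeps the check symmetric in a and b, in line with the equivalent statements listed above.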


To reiterate what we have learned so far, let’s examine the following operations
in modulus 7:


(1) 3 + 3 ≡ 6
(2) 3 + 14 ≡ 3
(3) 3 × 3 = 9 ≡ 2
(4) 5 × 4 = 20 ≡ 6
(5) 7 ≡ 0
(6) 35 ≡ 0
(7) -44 ≡ -44 + 0 ≡ -44 + 7 × 7 ≡ 5
(8) -33 ≡ -33 + 0 ≡ -33 + 5 × 7 ≡ 2


The reasoning behind each result is as follows:

(1) 3 + 3 = 6; 6 is less than the modulus, and so is unchanged
(2) 3 + 14 = 17; 17 ÷ 7 = 2 and a remainder of 3
(3) 3 × 3 = 9; 9 ÷ 7 = 1 and a remainder of 2
(4) 5 × 4 = 20; 20 ÷ 7 = 2 and a remainder of 6
(5) 7 = 7; 7 ÷ 7 = 1 and a remainder of 0
(6) 35 = 35; 35 ÷ 7 = 5 and a remainder of 0
(7) -44 ≡ -44 + 0; -44 + (7 × 7) = 5
(8) -33 ≡ -33 + 0; -33 + (5 × 7) = 2
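The eight results can be verified mechanically. The following sketch simply reduces each value in modulus 7; the variable names are chosen for this illustration only.

```python
MODULUS = 7

# The eight values from the list, before reduction in modulus 7.
values = [3 + 3, 3 + 14, 3 * 3, 5 * 4, 7, 35, -44, -33]

reduced = [value % MODULUS for value in values]
print(reduced)
```

In Python the `%` operator returns a result between 0 and 6 even for negative numbers, which matches the convention used in the text. Some other languages return a negative remainder for a negative operand, in which case the modulus has to be added once more.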


MULTIPLICATION TABLE IN MODULUS 5 USING EXCEL

A multiplication table in modulus 5 would look like this:

×   0  1  2  3  4
0   0  0  0  0  0
1   0  1  2  3  4
2   0  2  4  1  3
3   0  3  1  4  2
4   0  4  3  2  1

It is easy to formulate this and other similar tables with only a modest knowledge
of Excel spreadsheets. In the case of our example, the syntax of the Excel expressions
on our computer (using our row and column positions) is shown below. The
concept “remainder of dividing a number by 5” is translated into Excel language by
“=remainder(number;5).” The actual instruction for finding the product of 4 times 3 in
modulus 5 would be, then, “=remainder(4*3;5)”, an operation that would give us the
value 2. (In English-language versions of Excel the remainder function is called MOD,
and the instruction is typically written “=MOD(4*3,5)”.) Such tables are very useful
when carrying out modular arithmetic calculations.

=REMAINDER(B$5*$A6;5)  =REMAINDER(C$5*$A6;5)  =REMAINDER(D$5*$A6;5)  =REMAINDER(E$5*$A6;5)  =REMAINDER(F$5*$A6;5)
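For readers without a spreadsheet to hand, the same kind of table can be produced with a few lines of Python. This is an alternative sketch, not the spreadsheet method described in the box, and the function name `multiplication_table` is an assumption of the example.

```python
def multiplication_table(modulus: int) -> list:
    """Row r, column c holds the product r * c reduced in the given modulus."""
    return [[(row * column) % modulus for column in range(modulus)]
            for row in range(modulus)]


table = multiplication_table(5)
for row in table:
    print(" ".join(str(value) for value in row))
```

Changing the argument from 5 to any other modulus produces the corresponding table without touching the rest of the code.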
What is the relationship between modular arithmetic and Caesar’s cipher? To
answer the question we will set out a conventional alphabet and an alphabet with
a displacement of 3 letters, to which we add a numerical header corresponding to
the 26 characters.


 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z
 D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z  A  B  C


We can see that the ciphered version of letter number x (in the plaintext alphabet)
is the letter that occupies the position x + 3 (also in the plaintext alphabet). So
we need a transformation that assigns to each numerical value the same value
displaced by three units, with the result taken in modulus 26. Note that
3 is the key of the cipher. So its function is defined as


C(x) = (x + 3) (mod. 26),


where x is the unciphered value and C(x) is the ciphered value. It is sufficient to
substitute the letter by its numerical equivalent and apply the transformation. Let
us take as an example the message “PLAY” and let us encode it.


The P would be 15, C(15) = 15 + 3 ≡ 18 (mod. 26), which corresponds to S.
The L would be 11, C(11) = 11 + 3 ≡ 14 (mod. 26), thus obtaining O.
The A would be 0, C(0) = 0 + 3 ≡ 3 (mod. 26), thus obtaining D.
The Y would be 24, C(24) = 24 + 3 = 27 ≡ 1 (mod. 26), thus obtaining B.


The message “PLAY” ciphered in a key of 3 is “SODB.”


In general, if x indicates the position of the letter we wish to encode (0 for A, 1
for B, etc.), the position of the ciphered letter [denoted by C(x)] will be expressed
by the formula


C(x) = (x + k) (mod. n)


where n = the length of the alphabet (26 in the English alphabet) and k = the key,
which determines how far each letter of the message is displaced.
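The general formula translates almost word for word into code. In the sketch below the names `cipher_position` and `encrypt` are assumptions made for illustration, and the message is assumed to contain only the capital letters A to Z.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def cipher_position(x: int, k: int, n: int = 26) -> int:
    """C(x) = (x + k) (mod. n): position of the ciphered letter."""
    return (x + k) % n


def encrypt(message: str, k: int) -> str:
    """Encrypt a message made up only of letters from ALPHABET."""
    n = len(ALPHABET)
    return "".join(ALPHABET[cipher_position(ALPHABET.index(ch), k, n)]
                   for ch in message)


print(encrypt("PLAY", 3))
```

Keeping `cipher_position` separate from `encrypt` makes the arithmetic on positions visible on its own, independently of how letters are mapped to numbers.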
The deciphering of such a message involves the reverse calculations to ciphering it.
In terms of our example, deciphering is equivalent to applying the inverse formula
to the one used in ciphering:


C⁻¹(x) = (x - k) (mod. n).


In the case of the message ciphered as “SODB”, with a Caesar cipher with a key
of 3 in the English alphabet, k = 3 and n = 26, therefore


C⁻¹(x) = (x - 3) (mod. 26).

The process is as follows:


For S, x = 18, C⁻¹(18) = 18 - 3 ≡ 15 (mod. 26), which corresponds to P.
For O, x = 14, C⁻¹(14) = 14 - 3 ≡ 11 (mod. 26), by which we obtain L.
For D, x = 3, C⁻¹(3) = 3 - 3 ≡ 0 (mod. 26), obtaining the A.
For B, x = 1, C⁻¹(1) = 1 - 3 = -2 ≡ -2 + 26 ≡ 24 (mod. 26), obtaining the Y.


The message “SODB” ciphered in Caesar’s cipher with a key of 3 corresponds,
as we already know, to the plaintext “PLAY.”
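Deciphering is the same calculation with the key subtracted instead of added. A minimal sketch, with the invented names `plain_position` and `decrypt` and the same assumption that only the capital letters A to Z appear:

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def plain_position(x: int, k: int, n: int = 26) -> int:
    """Inverse formula: (x - k) (mod. n) gives the position of the plaintext letter."""
    return (x - k) % n


def decrypt(ciphertext: str, k: int) -> str:
    """Decrypt a ciphertext made up only of letters from ALPHABET."""
    n = len(ALPHABET)
    return "".join(ALPHABET[plain_position(ALPHABET.index(ch), k, n)]
                   for ch in ciphertext)


print(decrypt("SODB", 3))
```

For the letter B the subtraction gives -2, and the reduction in modulus 26 turns it into 24, exactly as in the hand calculation.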

To conclude this first foray into the mathematics of cryptography, we can
establish a new transformation, known as an affine cipher, which generalises Caesar’s
cipher. The transformation is defined as:


C_(a,b)(x) = (ax + b) (mod. n)


with a and b being two whole numbers smaller than the number (n) of letters in
the alphabet. The greatest common divisor (gcd) of a and n has to be
1 [gcd(a,n) = 1], because otherwise different letters could be ciphered in the
same way, as we shall see later on. The key of the cipher is determined by the pair
(a,b). Caesar’s cipher with a key of 3 would, then, be an affine
cipher with the values a = 1 and b = 3.
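The affine transformation can be sketched as follows. The function name `affine_encrypt` and the choice to reject unsuitable keys with an exception are assumptions of this example; `math.gcd` is the standard-library greatest common divisor.

```python
from math import gcd

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def affine_encrypt(message: str, a: int, b: int) -> str:
    """Apply (a*x + b) (mod. n) to each letter of a message written with ALPHABET."""
    n = len(ALPHABET)
    if gcd(a, n) != 1:
        raise ValueError("a and n must be coprime, otherwise letters collide")
    return "".join(ALPHABET[(a * ALPHABET.index(ch) + b) % n] for ch in message)


print(affine_encrypt("PLAY", 1, 3))  # a = 1, b = 3 is Caesar's cipher with a key of 3
print(affine_encrypt("PLAY", 5, 8))  # an arbitrary key chosen only for illustration
```

With a = 1 the function reproduces the Caesar result, which is one way to see that the affine cipher is a generalisation and not a different method.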

The general affine ciphers like these offer greater security than a conventional
Caesar cipher. Why? As we have seen, the key of an affine cipher is a pair of numbers
(a,b). In the case of a message written in an alphabet of 26 letters and encrypted
by means of an affine cipher, b can adopt any value between 0 and 25, while a must
be one of the 12 values in that range that are coprime with 26 (1, 3, 5, 7, 9, 11, 15,
17, 19, 21, 23 and 25). The number of keys possible in this system of encryption with
an alphabet of 26 letters is, therefore, 12 × 26 = 312, twelve times as many as in
Caesar’s cipher. The increase is considerable, but the cipher is still susceptible to
deciphering by brute force.


GREATEST COMMON DIVISOR (GCD)

The greatest common divisor of two numbers can be obtained with Euclid’s algorithm.
This algorithm consists of dividing one number by the other and then carrying out
successive divisions between the preceding divisor and the new remainder. The process
concludes when the remainder is 0, the divisor of the last division being the greatest
common divisor of both numbers. For example, to find gcd(48,30):

48 is divided by 30 and we get the remainder 18 and the quotient 1.
30 is divided by 18 and we get the remainder 12 and the quotient 1.
18 is divided by 12 and we get the remainder 6 and the quotient 1.
12 is divided by 6 and we get the remainder 0 and the quotient 2.

We have completed the algorithm. The gcd(48,30) is 6.

If gcd(a,n) = 1, we say that a and n are coprime.

Bézout’s identity, of great importance in cryptography, establishes that for two
integers a and n larger than 0, there are integers k and q such that
gcd(a,n) = ka + nq.
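Euclid’s algorithm and Bézout’s identity, as described in the box on the greatest common divisor, can be sketched as follows. The function names `euclid_gcd` and `bezout` are invented for this illustration, and the second function uses the standard extended form of Euclid’s algorithm, which the text does not spell out.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Repeat the division, replacing (a, b) by (b, remainder) until the remainder is 0."""
    while b != 0:
        a, b = b, a % b
    return a


def bezout(a: int, n: int) -> tuple:
    """Return (g, k, q) such that g = gcd(a, n) = k*a + q*n."""
    old_r, r = a, n
    old_k, k = 1, 0
    old_q, q = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_k, k = k, old_k - quotient * k
        old_q, q = q, old_q - quotient * q
    return old_r, old_k, old_q


print(euclid_gcd(48, 30))
g, k, q = bezout(48, 30)
print(g, k, q, k * 48 + q * 30)
```

The coefficients returned by `bezout` are not unique; any pair that satisfies the identity would do, and the last printed value simply confirms that this pair does.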


Playing spies 


Under what conditions is it possible to decipher a message encrypted with an
affine cipher, whether as the intended recipient or as a spy? We will explore this
question using a simple example of a cipher for an alphabet of six letters:


A B C D E F
0 1 2 3 4 5


The text will be encrypted with the affine cipher C(x) = 2x + 1 (mod. 6).

The A is ciphered according to C(0) = 2 × 0 + 1 ≡ 1 (mod. 6), which corresponds to B.
The B is ciphered according to C(1) = 2 × 1 + 1 ≡ 3 (mod. 6), which corresponds to D.
The C is ciphered according to C(2) = 2 × 2 + 1 ≡ 5 (mod. 6), which corresponds to F.
The D is ciphered according to C(3) = 2 × 3 + 1 = 7 ≡ 1 (mod. 6), which corresponds to B.
The E is ciphered according to C(4) = 2 × 4 + 1 = 9 ≡ 3 (mod. 6), which corresponds to D.
The F is ciphered according to C(5) = 2 × 5 + 1 = 11 ≡ 5 (mod. 6), which corresponds to F.


The proposed affine cipher encrypts the messages “ABC” and “DEF” in the same
way and the original message is lost. What has happened?

If we work with a cipher expressed as C_(a,b)(x) = (ax + b) (mod. n), we can decipher
the message unequivocally only if gcd(a,n) = 1. In our example, gcd(2,6) = 2
and therefore the cipher fails this restriction.
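The collision in the six-letter example is easy to reproduce. In this sketch the function name `affine` is an assumption, and the second key (a = 5, b = 1) is chosen here only as a contrast because gcd(5,6) = 1.

```python
ALPHABET = "ABCDEF"  # the six-letter alphabet of the example


def affine(message: str, a: int, b: int) -> str:
    """Apply (a*x + b) (mod. 6) to each letter, with no check on the key."""
    n = len(ALPHABET)
    return "".join(ALPHABET[(a * ALPHABET.index(ch) + b) % n] for ch in message)


print(affine("ABC", 2, 1), affine("DEF", 2, 1))  # gcd(2,6) = 2: both give the same text
print(affine("ABCDEF", 5, 1))                    # gcd(5,6) = 1: six different letters
```

With a = 2 only three distinct ciphered letters can ever appear, so half of the information in the message is lost before it is sent.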

The mathematical operation of deciphering is equivalent to finding the
unknown x given a numerical value y in modulus n.


C_(a,b)(x) = (ax + b) ≡ y (mod. n)
(ax + b) ≡ y (mod. n)
ax ≡ y - b (mod. n).


In other words, we are seeking a value a⁻¹ (the inverse of a), which satisfies
a⁻¹a ≡ 1 (mod. n), such that

a⁻¹ax ≡ a⁻¹(y - b) (mod. n)
x ≡ a⁻¹(y - b) (mod. n).


Consequently, to decipher successfully we have to calculate the inverse of a
number a in modulus n and, in order to avoid wasting time, we need to know in
advance if there really is such an inverse. An affine cipher C_(a,b)(x) = (ax + b) (mod. n)
will have an inverse if, and only if, gcd(a,n) = 1.

In the case of the affine cipher in the example, C(x) = 2x + 1 (mod. 6), we want
to know if the number a, in our case 2, has an inverse. That is, if there is a whole
number m smaller than 6 such that 2 × m ≡ 1 (mod. 6). To do this we try all the
possible values in modulus 6 (0, 1, 2, 3, 4, 5):


2 × 0 = 0, 2 × 1 = 2, 2 × 2 = 4, 2 × 3 = 6 ≡ 0, 2 × 4 = 8 ≡ 2, 2 × 5 = 10 ≡ 4.


There is no such value, from which we conclude that 2 does not have an inverse.
In reality, we already knew this since gcd(2,6) ≠ 1.
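The trial of every value in the modulus can be written as a short search. The function name `modular_inverse` is an assumption of this sketch, and returning `None` when no inverse exists is a design choice made here for illustration.

```python
def modular_inverse(a: int, n: int):
    """Try every value 0..n-1 and return the one whose product with a is 1 in modulus n."""
    for candidate in range(n):
        if (a * candidate) % n == 1:
            return candidate
    return None  # no inverse exists, which happens exactly when gcd(a, n) is not 1


print(modular_inverse(2, 6))
print(modular_inverse(5, 6))
```

An exhaustive search is perfectly adequate for alphabets of a few dozen letters. For large moduli the inverse would normally be obtained from Bézout’s identity instead.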
Let’s now assume that we have intercepted a coded message: “YSFMG”. We
know that it has been encrypted with the affine cipher C(x) = 2x + 3
and was originally written in Spanish with an alphabet of 27 letters (including an
Ñ following the regular N). What is the original message? First we calculate
gcd(2,27), which is equal to 1. The original message can be deciphered! To do so
we have to find the inverse function of C(x) = 2x + 3 in modulus 27:


y = 2x + 3
2x = y - 3.


To isolate the x we have to multiply both sides of the equation by the inverse of
2. The inverse of 2 in modulus 27 is a whole number m such that 2 × m ≡ 1 (mod. 27),
that is 14, which we confirm:


14 × 2 = 28 ≡ 1 (mod. 27).

Consequently,


x = 14(y - 3).


Now we can decipher the message:
The letter Y occupies position 25 and deciphered it will be 14(25 - 3) = 308 ≡ 11
(mod. 27). The letter that occupies position 11 in the alphabet is L.
In the case of the letter S, 14(19 - 3) = 224 ≡ 8 (mod. 27), which corresponds to the
letter I.
In the case of F, 14(5 - 3) = 28 ≡ 1 (mod. 27), which corresponds to B.
In the case of M, 14(12 - 3) = 126 ≡ 18 (mod. 27), which corresponds to R.
In the case of G, 14(6 - 3) = 42 ≡ 15 (mod. 27), which corresponds to O.

The deciphered message is the Spanish word “LIBRO” (meaning book).
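The whole deciphering procedure can be checked with a round trip: encrypt “LIBRO” with C(x) = 2x + 3 and then apply the inverse function. The sketch below is illustrative only; the function names are invented, and the 27-letter alphabet string is written out on the assumption that Ñ sits immediately after N, as the text states.

```python
ALPHABET = "ABCDEFGHIJKLMNÑOPQRSTUVWXYZ"  # Spanish alphabet of 27 letters
N = len(ALPHABET)
A, B = 2, 3  # key of the cipher C(x) = 2x + 3


def modular_inverse(a: int, n: int) -> int:
    """Search for the value whose product with a is 1 in modulus n."""
    for candidate in range(1, n):
        if (a * candidate) % n == 1:
            return candidate
    raise ValueError("a has no inverse in this modulus")


def encrypt(message: str) -> str:
    return "".join(ALPHABET[(A * ALPHABET.index(ch) + B) % N] for ch in message)


def decrypt(ciphertext: str) -> str:
    a_inverse = modular_inverse(A, N)  # 14 for a = 2 and n = 27
    return "".join(ALPHABET[(a_inverse * (ALPHABET.index(ch) - B)) % N]
                   for ch in ciphertext)


ciphertext = encrypt("LIBRO")
print(ciphertext, decrypt(ciphertext))
```

Running the encryption first has the side benefit of confirming which five ciphered letters correspond to the plaintext before they are deciphered again.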


Beyond the affine cipher 


Various security systems were based for many centuries on Caesar’s idea and its 
generalisation in the form of the affine cipher. Nowadays any cipher in which each 
letter of the original message is substituted by another letter that has been shifted a 
fixed number of places (not necessarily three) is called Caesar’s cipher. 

One of the greatest virtues of a good encrypting algorithm is the ability to
generate a large quantity of keys. Both Caesar’s cipher and the affine cipher are
vulnerable to cryptanalysis because the maximum number of keys is low.

If we eliminate any restriction regarding the order of the letters of the ciphered
alphabet, however, the potential number of keys increases markedly. The number of keys
available to the standard 26-character alphabet (in any order) is
26! = 403,291,461,126,605,635,584,000,000, that is, about 403 septillion keys. A
code-breaker investigating one potential key every second would take more than one
billion times the expected life of the universe to exhaust all the possibilities!
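The size of this key space is easy to confirm with exact integer arithmetic. In the sketch below the length of a year is an assumption (365.25 days), used only to convert seconds into years.

```python
import math

caesar_keys = 26
substitution_keys = math.factorial(26)  # every ordering of the 26 letters

print(f"{substitution_keys:,}")
print(substitution_keys // caesar_keys)  # how many times larger than Caesar's key space

SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60  # assumed length of a year
years = substitution_keys / SECONDS_PER_YEAR
print(f"{years:.2e} years at one key per second")
```

Python integers have no fixed size, so the factorial is computed exactly and not as a floating-point approximation.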


A possible code with a general substitution algorithm could be the following:


(1)  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
(2)  Q W E R T Y U I O P A S D F G H J K L Z X C V B N M


Row (1) Plaintext alphabet. Row (2) Ciphered alphabet.


The first six letters of the ciphered alphabet give a clue as to the selected ordering:
it corresponds to the order of the letters on a keyboard that follows the QWERTY
standard. To cipher Caesar’s famous comment “VENI VIDI VICI” (“I came, I saw, I
conquered”) with the QWERTY code, for every letter of the conventional alphabet
we look for the corresponding one in the ciphered alphabet.


That would give us the following ciphered message:


CTFO CORO COEO
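A general substitution of this kind is a simple table lookup. The sketch below uses Python’s built-in `str.maketrans` and `str.translate`; the ciphered alphabet is written out in QWERTY keyboard order, as described in the text, and the constant names are assumptions of the example.

```python
PLAIN = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
QWERTY = "QWERTYUIOPASDFGHJKLZXCVBNM"  # ciphered alphabet in keyboard order

ENCRYPT_TABLE = str.maketrans(PLAIN, QWERTY)
DECRYPT_TABLE = str.maketrans(QWERTY, PLAIN)

ciphertext = "VENI VIDI VICI".translate(ENCRYPT_TABLE)
print(ciphertext)
print(ciphertext.translate(DECRYPT_TABLE))
```

Characters that are not in the table, such as the spaces, pass through unchanged. Swapping the two arguments of `str.maketrans` is all that is needed to build the deciphering table.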


There is a very simple way to generate an almost inexhaustible number of codes
that are easy to remember for this ciphering method. It is sufficient to agree on
any keyword (it can even be a phrase) and place it at the beginning of the ciphered
alphabet, allowing the rest of the alphabet to follow the conventional order starting
after the last letter of the keyword, taking care not to repeat any letters. An
example would be “JANUARY CIPHER”. First we would eliminate the space and the
repeated letters (A and R, which each occur twice), thus getting the keyword
“JNUYCIPHE.” The resulting ciphered alphabet would be the following:


(1)  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
(2)  J N U Y C I P H E F G K L M O Q R S T V W X Z A B D


The message “VENI VIDI VICI” would now be ciphered as “XCME XEYE
XEUE.” This system of generating codes can be arranged so that errors by the sender
or the receiver are unlikely, and it is simple to update. In our example, it would be
enough to change the code each month, from JANUARY CIPHER to FEBRUARY
CIPHER and from there to MARCH CIPHER, etc., without the communicators
having to speak to each other after the code was established.
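The construction of a keyword alphabet can be sketched as follows. The function name `keyword_alphabet` is invented for this illustration, and the code follows the convention of the worked example, in which a letter that occurs more than once in the phrase is dropped from the keyword altogether. A variant that keeps the first occurrence of each letter would be equally workable, provided sender and receiver agree on it.

```python
from collections import Counter

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def keyword_alphabet(phrase: str) -> str:
    """Build a ciphered alphabet from a key phrase, following the worked example."""
    letters = [ch for ch in phrase.upper() if ch in ALPHABET]  # drop spaces
    counts = Counter(letters)
    keyword = [ch for ch in letters if counts[ch] == 1]  # drop repeated letters
    start = ALPHABET.index(keyword[-1]) + 1  # continue after the keyword's last letter
    rest = [ALPHABET[(start + i) % len(ALPHABET)] for i in range(len(ALPHABET))]
    rest = [ch for ch in rest if ch not in keyword]
    return "".join(keyword + rest)


cipher_alphabet = keyword_alphabet("JANUARY CIPHER")
print(cipher_alphabet)
print("VENI VIDI VICI".translate(str.maketrans(ALPHABET, cipher_alphabet)))
```

Because the alphabet is derived entirely from the phrase, changing the phrase each month changes the whole ciphered alphabet without any further exchange between the parties.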

The reliability and simplicity of the keyword substitution algorithm made it 
the preferred encrypting system for many centuries. During that time the general 
consensus was that the cryptographers had the upper hand over the cryptanalysts. 


Medieval cryptanalysts believed they saw ciphers in the Old Testament, and they were
not mistaken. There are several fragments of sacred texts that are encrypted with a
substitution cipher called Atbash. This cipher consists in substituting any letter (n)
for the letter that is the same distance from the end of the alphabet as n is from the
beginning. For example, in our alphabet, the A is substituted by Z, B by Y, etc. In the
case of the original Old Testament the substitutions are carried out with the letters
of the Hebrew alphabet. So in Jeremiah (25:26) the word “Babel” is ciphered as
“Sheshakh.”

A Hebrew Bible from the early 18th century.
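Atbash is simple enough to express in one line of arithmetic: the letter at position x is replaced by the letter at position n - 1 - x. The sketch below applies it to our 26-letter alphabet, as in the A to Z and B to Y example, and not to the Hebrew alphabet of the original texts; the function name `atbash` is an assumption.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def atbash(text: str) -> str:
    """Replace each letter by the one equally far from the other end of the alphabet."""
    n = len(ALPHABET)
    return "".join(
        ALPHABET[n - 1 - ALPHABET.index(ch)] if ch in ALPHABET else ch
        for ch in text.upper()
    )


print(atbash("AB"))
print(atbash(atbash("BABEL")))
```

Applying the function twice returns the original text, so the same operation serves for both ciphering and deciphering.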
Frequency analysis


The Koran is composed of 114 chapters, each of which corresponds to one of the 
Prophet Muhammad’s revelations. These revelations were written down during 
the life of the Prophet by various companions and later collected by Abu Bakr, 
the first caliph. Umar and Uthman, the second and third caliphs respectively, 
completed the project. The fragmentary nature of the original writings encouraged 
the birth of a branch of theology devoted to the exact dating of the different revel- 
ations. Among other dating techniques, Koranic scholars compiled the frequency 
of the appearance of certain words considered to be newly coined throughout the 
writing period. If a revelation contained enough of these newer words, it was 


reasonable to conclude that it was a comparatively late revelation. 


14th century Koran manuscript. 


This initiative turned out to be the first specific cryptanalysis tool ever invented:
frequency analysis. The first person to leave a written record of this revolutionary
technique was a philosopher by the name of Al-Kindi, who was born in Baghdad
in the year 801. Although he was an astronomer, doctor, mathematician and
linguist, the occupation for which he is most remembered is that of cryptanalyst. If he
was not the first, Al-Kindi was certainly the most important one in history.

Very little was known about Al-Kindi’s pioneering role until relatively recently.
In 1987, a copy of a treatise of his entitled On Deciphering Cryptographic Messages
surfaced in an archive in Istanbul. This contains a very succinct précis of the
groundbreaking technique:


“One way to decode a ciphered message, if we know in what language it is 
written, is to find a plaintext written in the same language that is sufficiently 
long, and then count how many times each letter appears. The letter that ap- 
pears with the most frequency we will call the “first,” the next most frequent 
we will call “second”... and so on until we have covered all the letters that 
appear in our text. Then we observe the coded text that we are deciphering 
and we classify its symbols in the same manner. We find the symbol that ap- 
pears with the most frequency, and we substitute it with the “first” from our 
text, we do the same with the “second” and so on, until we have covered all 


the symbols of the cryptogram we are deciphering.” 
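The counting step that Al-Kindi describes can be sketched in a few lines. The function name `rank_letters` is invented for this example, and ties between equally frequent letters are broken alphabetically, a choice made here only so that the result is reproducible.

```python
from collections import Counter


def rank_letters(text: str) -> list:
    """Return the letters of the text ordered from most to least frequent."""
    counts = Counter(ch for ch in text.upper() if ch.isalpha())
    return sorted(counts, key=lambda ch: (-counts[ch], ch))


print(rank_letters("banana"))
```

The same function can be applied first to a long plaintext in the known language and then to the cryptogram, giving the two ranked lists that are matched position by position.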


In earlier pages, he mentions that in the substitution cipher method, each letter 
of the original message “maintains its position but changes its role,” and it is precisely 
this constancy of “maintaining the position” that makes it susceptible to frequency 
cryptanalysis. Al-Kindi’s genius reversed the balance of power between cryptogra- 


phers and cryptanalysts, swinging it, for a time at least, toward the eavesdroppers. 


A detailed example 


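From greatest to least frequent, this is how letters are used in English texts:
E T A O I N S H R D L C U M W F G Y P B V K J X Q Z. The percentage of appearances
made by each letter is shown in the following frequency table.


A   8.17%     H   6.09%     O   7.51%     V   0.98%
B   1.49%     I   6.97%     P   1.93%     W   2.36%
C   2.78%     J   0.15%     Q   0.10%     X   0.15%
D   4.25%     K   0.77%     R   5.99%     Y   1.97%
E  12.70%     L   4.03%     S   6.33%     Z   0.07%
F   2.29%     M   2.41%     T   9.06%
G   2.02%     N   6.75%     U   2.76%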
2.02% 
If a message has been ciphered with a substitution algorithm like the ones
discussed earlier, it is open to being decoded according to the relative frequency of
the letters of the original message. It is enough to count the appearance of each of
the ciphered letters and compare them to the frequency table of the language in
which it was written. So, if the letter that appears most often in the ciphertext is,
for example, J, the letter of the original message to which it most likely corresponds
would be, in the case of English, an E. If the second most frequent letter is Z, the
same reasoning would lead us to conclude that T is the most likely corresponding
letter. The process is repeated for all the letters of the ciphertext and thus the
cryptanalysis is complete.
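The matching process can be sketched as a first guess at the key. The names `first_guess` and `apply_guess` are assumptions of this example, the order of English letters is the one quoted in the text, and ties are broken alphabetically for reproducibility. The result is only a starting hypothesis.

```python
from collections import Counter

ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ"  # most to least frequent, as quoted in the text


def first_guess(ciphertext: str) -> dict:
    """Pair the most frequent ciphered letter with E, the next with T, and so on."""
    counts = Counter(ch for ch in ciphertext.upper() if ch.isalpha())
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return dict(zip(ranked, ENGLISH_ORDER))


def apply_guess(ciphertext: str, guess: dict) -> str:
    """Substitute every ciphered letter that has a guessed plaintext letter."""
    return "".join(guess.get(ch, ch) for ch in ciphertext.upper())


guess = first_guess("JJJZZQ")
print(guess)
print(apply_guess("JJJZZQ", guess))
```

On a real cryptogram the first guess is usually wrong for some letters, and the cryptanalyst corrects the mapping by hand as recognisable words begin to emerge.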

Obviously the frequency method cannot always be applied so directly. The
frequencies of the previous table are correct only on average. Short texts such as
“Visit the zoo kiosk for quiz tickets” have a relative frequency of letters that is
very different to that which characterises the language as a whole. In effect, in texts
of less than 100 characters this simple analysis is rarely useful. Frequency analysis,
however, is not limited to the study of letters on their own. Although we agree that it
is often unlikely that the most frequent letter in a short ciphertext is E, we can be
more certain that the five most frequent letters are probably A, E, I, O and T, without
knowing which corresponds to which. A and I never appear in pairs in English,
while the other letters can. Moreover, it is also likely that, however short the text,
the vowels tend to appear in front of and behind clusters of other letters, while the
consonants tend to group with vowels or with small numbers of letters. In this way,
we can perhaps differentiate the T from the A, E, I and the O. As we successfully
decipher some letters, words will appear where we only need to decipher one or two
characters, which will allow us to pose hypotheses on the identity of those letters.
The speed with which we can decipher increases as we decipher more letters.


SHERLOCK HOLMES, CRYPTANALYST

Deciphering by frequency analysis is a very dramatic technique that has attracted the
attention of a large number of authors. Perhaps the most famous story based on the
cryptanalysis of a message is The Gold-Bug, written by Edgar Allan Poe in 1843. The
Appendix contains a detailed account of the fictional message encrypted by Poe and its
flawless solution using frequency analysis. Other narrators such as Jules Verne and
Arthur Conan Doyle used similar devices to add suspense to their story lines. In The
Adventure of the Dancing Men, Conan Doyle confronts his creation Sherlock Holmes with a
substitution cipher that forces the detective to turn to frequency analysis. More than
1,000 years later Al-Kindi’s idea was still able to enthral everyday people with its
ingenuity.

The first of the coded messages that Sherlock Holmes must decipher in The Adventure of
the Dancing Men, which we will not decipher here so as not to spoil the story for future
readers of the book. Suffice it to say that the small flags raised by the dancing
figures constitute an important element of the cipher.


The polyalphabetic cipher 


On February 8, 1587, Mary, Queen of Scots, was beheaded at Fotheringhay Castle
after being found guilty of treason. The judicial proceedings leading to such a
drastic sentence had demonstrated beyond doubt that Mary had been colluding
with a group of Catholic aristocrats, headed by the young Anthony Babington, in
a plan to assassinate Queen Elizabeth I of England and install Mary at the head
of a Catholic kingdom encompassing both England and Scotland. The decisive
evidence was offered by Elizabeth’s counterespionage service, headed by Sir Francis
Walsingham. It comprised a series of letters between Mary and Babington which
clearly stated that the young Scottish queen knew about the deadly plan and approved
of it. The letters in question were ciphered with an algorithm that combined ciphers
and codes. In other words, not only did it exchange letters with other characters, but
it also employed unique symbols to refer to certain words of common usage. Mary’s
ciphered alphabet appears below:


[Figure: the symbols of Mary’s ciphered alphabet.]


Except for the fact that it used symbols instead of letters, Mary’s ciphered
alphabet is no different to any other used for centuries by cryptographers all over the
world. The young queen and her conspirators were convinced that the cipher was
secure but, unfortunately for her, Elizabeth’s best cryptanalyst, Thomas Phelippes,
was an expert in frequency analysis and was able to decipher Mary’s letters with
little difficulty. The thwarting of what came to be known as the Babington Plot sent
a powerful signal to the governments and agents of all Europe: the conventional
substitution algorithm was no longer secure. The cryptographers appeared impotent
in the face of the power of the new deciphering tools.


A fragment of one of Mary, Queen of Scots’ letters to the conspirator Anthony Babington, 
which would eventually condemn her to death. 


Alberti's contribution 


However, a solution to the problem posed by frequency analysis had been found
more than a century before Mary’s head was put on the block. The architect of
the new cipher was none other than the multi-talented Renaissance scholar Leon
Battista Alberti. Generally better known as an architect and mathematician who
made great leaps forward in the study of perspective, in 1460 Alberti devised a
system of encryption that consisted of adding a second ciphered alphabet to the first
one as shown in the following table:


(1)  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
(2)  D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
(3)  M N B V C X Z L K J H G F D S A P O I U Y T R E W Q


Row (1) Plaintext alphabet. Row (2) Ciphered alphabet 1. Row (3) Ciphered alphabet 2.


To encrypt any message whatsoever, Alberti proposed alternating the two
ciphered alphabets. For example, in the case of the word “SHEEP”, the cipher for
the first letter would be found in the first alphabet (V), that of the second in the
second (L), and so on. In our example, “SHEEP” would be ciphered as “VLHCS.”
The advantage of this polyalphabetic encryption algorithm, in comparison with the
previous ones, is evident straight away: the double E from the plaintext is ciphered
in two different ways, H and C. To further confuse any cryptanalyst faced with
the encrypted text, the same ciphered letter can represent two different letters in the
plaintext. Frequency analysis, therefore, lost a large part of its usefulness. Alberti
never formally set out his idea in a treatise, and the cipher was later developed
independently at more or less the same time by two academics, the German Johannes
Trithemius and the Frenchman Blaise de Vigenère.


De Vigenère’s square


In Caesar's cipher, a monoalphabetic cipher is used; a single ciphered alphabet 
corresponds to the plaintext alphabet such that the same ciphered letter always 
corresponds to the same plaintext letter. (In the classic Caesar cipher, D is always 
an A, E is B, and so on). 

In a polyalphabetic cipher, on the other hand, a particular letter in a message
can be assigned as many letters as the number of ciphered alphabets used.
To encrypt a text, a different ciphered alphabet is used as one goes from one
letter of the plaintext to the next. The first and most famous polyalphabetic
cipher system is known as De Vigenère’s square. His table of alphabets
consisted of a plaintext alphabet of n letters below which appeared n ciphered
alphabets, each one shifted cyclically by one letter to the left in comparison to
the alphabet above it. In other words, a square matrix of 26 rows and 26
columns arranged as shown on the next page.

Note the symmetry in the correspondence of the letters. The pair (A,R) = (R,A),
and this same relationship applies to all the letters.
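The construction of the square and its symmetry can be checked with a few lines of Python. This is only a sketch: the function names are invented for the illustration, and it assumes that the top row is the unshifted plaintext alphabet and that each row is identified by its first letter.

```python
import string

ALPHABET = string.ascii_uppercase

def vigenere_square(alphabet=ALPHABET):
    """Row k is the alphabet shifted cyclically k places to the left."""
    return [alphabet[k:] + alphabet[:k] for k in range(len(alphabet))]

def lookup(square, row_letter, column_letter, alphabet=ALPHABET):
    """Letter where the row starting with row_letter meets the column
    headed by column_letter."""
    return square[alphabet.index(row_letter)][alphabet.index(column_letter)]

square = vigenere_square()
print(square[0])  # the plaintext alphabet
print(square[1])  # the same alphabet shifted one place to the left
# The pair (A,R) gives the same letter as the pair (R,A).
print(lookup(square, "A", "R"), lookup(square, "R", "A"))
```

The symmetry holds for every pair because looking up a cell amounts to adding the positions of the two letters, and addition does not depend on the order of its terms.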




We can immediately see that De Vigenère’s square consists of a plaintext alphabet
of n letters, each one of which is transformed according to increasing parameters.
So the first ciphered alphabet would serve to apply a Caesar cipher with the
parameters a = 1 and b = 2; the second would be equivalent to a Caesar cipher with
b = 3, etc. The key to De Vigenère’s square consists of knowing which letters of the
message are ciphered and how many rows down we go to find the corresponding
ciphered letter. The simplest key consists of moving down one row for every letter
of the original message.

A practical way to implement a polyalphabetic cipher is to use a device known as an Alberti
cipher disk. These portable ciphers consist of two concentric disks, a fixed one with a
conventional alphabet engraved on it, and a moveable one with another alphabet engraved on it.
The sender can, by rotating the moveable ring, match the plaintext alphabet with as many ciphered
alphabets as there are turns on the ring, up to a maximum equal to the number of letters of
the alphabet being used. The cipher obtained from an Alberti disk is very resistant to frequency
analysis. To decrypt the message, the recipient only has to make the same number of turns as
the sender. The security of this cipher, as always, depends on keeping the codes secret, that is,
the arrangement of the alphabet on the moveable ring plus the number of turns effected. An
Alberti disk with a single moveable ring engraved with a traditional alphabet allows for a
Caesar cipher at every turn. Similar devices were used in conflicts as recent as the American
Civil War, and today they can be found in children’s spy games.

An Alberti disk used by the Confederates
in the American Civil War.


So our classic phrase “VENI VIDI VICI” would be ciphered as follows: 


To cipher the first V, we find the corresponding letter in row 2: W.
To cipher the E, we find the corresponding letter in row 3: G.
To cipher the N, we find the corresponding letter in row 4: Q.
I (row 5): M
V (row 6): A
I (row 7): O
D (row 8): K
I (row 9): Q
V (row 10): E
I (row 11): S
C (row 12): N
I (row 13): U
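The row-by-row procedure above can be sketched in Python as follows. The function name and the `first_shift` parameter are invented for this illustration; the sketch assumes that row 2 of the square is the alphabet shifted by one place, row 3 by two places, and so on, and that spaces do not use up a row.

```python
import string

ALPHABET = string.ascii_uppercase

def progressive_encrypt(plaintext, first_shift=1):
    """Cipher each letter with the next row of the square:
    the first letter is shifted by first_shift, the second by one more, etc."""
    out = []
    shift = first_shift
    for ch in plaintext.upper():
        if ch not in ALPHABET:
            out.append(ch)  # keep spaces as they are; they do not consume a row
            continue
        out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        shift += 1
    return "".join(out)

print(progressive_encrypt("VENI VIDI VICI"))  # WGQM AOKQ ESNU
```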
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DIPLOMAT AND CRYPTOGRAPHER 


Blaise de Vigenère was born in France in 1523. In 1549,
he was sent by the French government on a diplomatic
mission to Rome, where he became interested
in cryptography and ciphered messages. In 1585, he
wrote his seminal work, Traicté des Chiffres (Treatise on
Ciphers), which describes the system of encryption to
which he gave his name. This cipher system was unassailable
for almost three centuries, until the Briton Charles
Babbage succeeded in deciphering it in 1854. Curiously
enough, this fact was not known until some time into
the 20th century, when a group of scholars examined
Babbage's personal notes and calculations.


Once encrypted, the original phrase would become “WGQM AOKQ ESNU.” As
can be immediately verified, the repeated letters in the original message disappear.
However, every cryptographer’s concern is to generate codes that are easy to remember,
to distribute and to update. Keywords with the same number of letters as the
message being enciphered, or fewer, were used to generate shorter, easier to use
De Vigenère’s squares. The keyword formed the first letters in each row (see page
47), followed by the rest of the alphabet (as they appear in the full square). The
keyword was then written below the plaintext, repeating as often as necessary. The
letter in the keyword below each of the plaintext characters directs the cryptographer
to the row in the square from which the ciphered letter is to be taken.


For example, if we wish to cipher the message “BUY MILK TODAY” by means
of the keyword “JACKSON”:

Original message  B U Y M I L K T O D A Y
Keyword           J A C K S O N J A C K S
Ciphered message  K U A W A Z X C O F K Q

The ciphered message is “KUAWAZXCOFKQ.”

De Vigenère’s square with the rows defined by the keyword JACKSON.


As in the case of all classical encryption systems, deciphering a
text encrypted using De Vigenère’s square mirrors the process of ciphering it.
Take, for example, the ciphered message “WZPKGIMQHQ” with a
keyword of “WINDY”:


Let’s look at the first column. We are seeking to solve the unknown “?” given
that (?,W) = W. To do this we look along the W row in the De Vigenère’s square
on page 44 until the W appears and we see which column it corresponds to; the
answer is A. Next, we look for a letter “?” that satisfies (?,I) = Z and we get R,
and so on. The original message is revealed as “ARCHIMEDES.”
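Both directions of the keyword method can be sketched in one short Python function. The name `vigenere` and its `decrypt` flag are choices made for this illustration; the sketch assumes the usual convention in which the keyword letter A leaves a letter unchanged, B moves it one place, and so on, and it drops spaces as the book's examples do.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase

def vigenere(text, keyword, decrypt=False):
    """Shift each letter by the position of the matching keyword letter.
    Deciphering applies the same shifts in the opposite direction."""
    sign = -1 if decrypt else 1
    key = cycle(keyword.upper())  # repeat the keyword as often as necessary
    out = []
    for ch in text.upper():
        if ch not in ALPHABET:
            continue  # ignore spaces and punctuation
        shift = ALPHABET.index(next(key))
        out.append(ALPHABET[(ALPHABET.index(ch) + sign * shift) % 26])
    return "".join(out)

print(vigenere("BUY MILK TODAY", "JACKSON"))           # KUAWAZXCOFKQ
print(vigenere("WZPKGIMQHQ", "WINDY", decrypt=True))   # ARCHIMEDES
```

Looking along the keyword row for the ciphered letter, as described above, is the same operation as subtracting the keyword letter's position, which is why one function covers both cases.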

The historical importance of De Vigenère’s square, which it shares in general
with other polyalphabetic ciphers such as Gronsfeld’s (developed at a similar time 
and explained in detail in the Appendix), is its resistance to frequency analysis. If the 
same letter could be ciphered in more than one way without making it impossible 
to decipher it subsequently, how could effective cryptanalysis be carried out? The 
question would remain unanswered for more than 300 years. 


Classifying alphabets 


Although it took almost eight centuries, the polyalphabetic ciphers such as De
Vigenère’s square finally outwitted frequency analysis. Curiously, monoalphabetic
systems, despite their weaknesses, had the advantage of being very simple to
implement. Cryptographers devoted themselves to refining the procedures and to
filling their algorithms with tricks, but fundamentally they kept on using the same
concepts as the simplest ciphers.

One of the most successful variants of the monoalphabetic system was that
known as the homophonic substitution cipher, which attempted to frustrate potential
attacks using statistical cryptanalysis by increasing the number of substitutes for the
letters with the greatest frequency of appearance. So, if the letter E represented, on
average, 10 per cent of a text in a given language, a homophonic substitution cipher
attempted to flatten that frequency by replacing the E with 10 alternative characters.
Such methods remained in favour until well into the 18th century.
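A homophonic substitution can be sketched in Python as follows. The symbol table is entirely hypothetical and covers only four letters; it was invented for this illustration and is not taken from any historical cipher. The only point it makes is that frequent letters receive more alternative codes than rare ones.

```python
import random

# Hypothetical table for the sketch: the more frequent the letter,
# the more two-digit codes it is given.
HOMOPHONES = {
    "E": ["10", "11", "12", "13"],
    "T": ["20", "21", "22"],
    "A": ["30", "31"],
    "N": ["40"],
}
# Every code belongs to exactly one letter, so deciphering is unambiguous.
DECODE = {code: letter for letter, codes in HOMOPHONES.items() for code in codes}

def encrypt(plaintext, rng):
    """Replace each letter with one of its codes, chosen at random."""
    return [rng.choice(HOMOPHONES[ch]) for ch in plaintext.upper()]

def decrypt(codes):
    return "".join(DECODE[code] for code in codes)

rng = random.Random(0)  # seeded only so that the sketch is repeatable
codes = encrypt("EATEN", rng)
print(codes)
print(decrypt(codes))  # EATEN
```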


THE CRYPTOGRAPHERS OF THE SUN KING 


Although few outside the court of Louis XIV knew of their existence, Antoine Rossignol
and his son Bonaventure were two of the most feared men in Europe during the upheavals
of the 17th century. Their ability to decipher messages of the enemies of France (and
of the personal enemies of the monarch) was matched by their inventiveness as
cryptographers. They developed the Grand Chiffre (Great Cipher), a complex algorithm of
syllable substitution used to encrypt the king's most important messages. When the
Rossignols died, however, the cipher fell out of use and its messages became unreadable.
Not until 1890 did a cryptography expert, the retired soldier Étienne Bazeries, take on
the arduous task of decrypting the ciphered documents and, following years of hard work,
become the unsuspecting recipient of the Sun King’s secret messages.


Louis XIV in a portrait by Mignard. 
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Things were to move on, though. The emergence of the great nation states and 
their accompanying diplomacy generated a marked increase in the demand for 
secure communication. This tendency was further reinforced by the appearance of 
new communication technologies, such as the telegraph, which boosted the volume 
of communications massively. The European nations established so-called “black
rooms,” nerve centres of activity from which the most delicate communications
were coded and where enemy intercepts were deciphered. The expert work of the
black rooms soon made any form of monoalphabetic substitution insecure, however
modified it might be. Little by little the great players in the game of information
exchange were opting for polyalphabetic algorithms. Having lost their most powerful
weapon, frequency analysis, the cryptanalysts were once again left defenceless in
the face of the cryptographers’ onslaught.


The anonymous cryptanalyst 
The British mathematician Charles Babbage (1791-1871) was one of the most
extraordinary scientific figures of the 19th century. He invented an early mechanical
computer called the difference engine that was way ahead of its time, and his
interests spanned all the mathematics and technology of the age. Babbage decided
to apply his intellect to deciphering polyalphabetic algorithms, with De Vigenère’s
square (see pages 44 and 47) as his prime target. He focused his attention on one
characteristic of this cipher. We should recall that, in the case of De Vigenère’s
cipher, the length of the chosen keyword determined the number of ciphered
alphabets in use. So, if the keyword were “WALK,” each letter of the original message
could be ciphered in up to 4 different ways. The same would be true of whole
words. This characteristic would be the toehold from which Babbage would begin
to climb the wall of the polyalphabetic cipher. Let’s look at the following example
of a message ciphered with De Vigenère’s square.

Original message  B Y L A N D O R B Y S E A
Ciphered message  X Y W K J D Z B X Y D O W


What immediately draws our attention is that the word “BY” of the original
message is ciphered with the same letters in both cases, XY. This is due to the fact
that the second BY occurs after eight characters and eight is a multiple of the
number of letters (four) in the keyword (WALK). With this information, and given
a sufficiently long original text, it is possible to guess the length of the keyword.
The procedure is as follows: you list all the repeated sequences of characters in the
ciphered text and note after how many characters they repeat. Then you seek the whole
divisors of these numbers. The common divisors are the candidates for the length
of the keyword.
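The search for repetitions and their divisors can be sketched in Python. The function names, the default sequence length of two letters and the upper limit of 20 on the keyword length are all assumptions made for this illustration; on a real ciphertext one would normally look for longer repeated sequences, since short ones often repeat by chance.

```python
from collections import Counter, defaultdict

def repeat_distances(ciphertext, length=2):
    """Distances between successive occurrences of every repeated
    sequence of the given length."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - length + 1):
        positions[ciphertext[i:i + length]].append(i)
    distances = []
    for places in positions.values():
        for first, second in zip(places, places[1:]):
            distances.append(second - first)
    return distances

def candidate_lengths(distances, max_length=20):
    """Count how often each possible keyword length divides a distance."""
    counts = Counter()
    for distance in distances:
        for k in range(2, max_length + 1):
            if distance % k == 0:
                counts[k] += 1
    return counts

distances = repeat_distances("XYWKJDZBXYDOW")
print(distances)                             # [8]
print(sorted(candidate_lengths(distances)))  # [2, 4, 8]
```

With such a short message the true length, four, cannot be told apart from two or eight; a longer text yields more repetitions, and the divisor that turns up most often becomes the leading candidate.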

Let’s assume that the most probable candidate is 5 because that is the common
divisor that appears most often. Now we have to guess which letter each of the
five letters of the keyword corresponds to. If we recall the encryption process, each
letter of the keyword in De Vigenère’s square establishes a monoalphabetic cipher
of the corresponding letter in the original message. In the case of our hypothetical
five-letter keyword (C1, C2, C3, C4, C5), the sixth letter (C6) is ciphered with
the same alphabet with which the first letter (C1) was ciphered, the seventh (C7)
with that used to cipher the second (C2), etc. Therefore, what the cryptanalyst is
actually dealing with is five separate monoalphabetic ciphers, each one of which is
vulnerable to traditional cryptanalysis.

A working section of Babbage’s difference engine, built in 1991 according to the plans
left by its inventor. The device allows the approximation of logarithmic and trigonometric
functions and, therefore, the calculation of astronomical tables. Babbage did not see it
built in his lifetime.

The process is concluded by drawing up a frequency table for each group of letters in
the ciphered text that was ciphered with the same letter of the keyword (C1, C6, C11...;
C2, C7, C12...; and so on) until you have the five groups of letters that make up the
total length of the message. Then compare these tables with a frequency table of the
language of the plaintext message in order to decipher the keyword. If the two data sets
do not appear to coincide, we start again with the second most probable length of keyword.
Once we have identified a probable keyword, all that is left to do is decipher
the message. By this method, the polyalphabetic code was broken.
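The splitting into groups can be sketched in Python as follows. The function names are invented for this illustration, and the guess for each keyword letter uses a deliberately crude rule, namely that the most frequent letter of each group stands for E. That rule is a simplifying assumption; a careful analysis would compare the whole frequency table of the group with that of the language, as described above.

```python
from collections import Counter

def split_groups(ciphertext, key_length):
    """Group i holds the letters ciphered with the i-th keyword letter:
    C1, C6, C11... then C2, C7, C12... for a key of length five."""
    return [ciphertext[i::key_length] for i in range(key_length)]

def guess_key_letter(group, assumed_common="E"):
    """Crude guess: the most frequent ciphered letter of the group is taken
    to stand for the most frequent letter of the language."""
    most_common = Counter(group).most_common(1)[0][0]
    shift = (ord(most_common) - ord(assumed_common)) % 26
    return chr(ord("A") + shift)

print(split_groups("ABCDEFGHIJKL", 5))  # ['AFK', 'BGL', 'CH', 'DI', 'EJ']
print(guess_key_letter("HHHHWD"))       # D, because H is three places after E
```

If the keyword guessed in this way does not produce readable text, the same steps are repeated with the next most probable keyword length.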

Babbage’s astounding exercise, completed around 1854, would, nonetheless, 
remain in obscurity. The eccentric British intellectual never published his discovery, 
and only recent reviews of his notes have led us to identify him as the pioneer of 
deciphering polyalphabetic keywords. Fortunately for cryptanalysts the whole world 
over, a few years later, in 1863, the Prussian officer Friedrich Kasiski published a 
similar method.

Irrespective of who was the first to break it, the polyalphabetic cipher had ceased 
to be impregnable. From this moment on, the strength of a cipher was going to 
depend less on great algorithmic innovations of encryption and more on increasing 
the number of potential ciphered alphabets, which would have to be so large as to 
make frequency analysis and its variants completely unfeasible. A parallel objective 
was to find ways of speeding up cryptanalysis. Both fields of enquiry converged
toward the same point and gave birth to the same process: computerisation.
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Coding machines 


The 19th century would expand the usefulness of codes way beyond sending se- 
cret messages. The development of the telegraph in the first third of the century 
and, thirty years later, the development of the two-way telegraph by Thomas Alva 
Edison, revolutionised communications and, consequently, the world. Since the 
telegraph functioned by electrical impulses, it was necessary to implement a sys- 
tem that would translate the content of the messages to a language that a machine 
could express — and transmit. In other words, a code was needed. From among the 
various proposals, a system of dots and dashes invented by the American artist and 
inventor Samuel F. B. Morse prevailed. Morse code can be considered a predecessor
of the codes that, many decades later, are used indirectly by us all to enter data into
computers and get information back out of them.


Morse code 


Morse code represents the letters of the alphabet, numbers and other signs by a 
combination of dots, dashes and spaces. In this way, it translates the alphabet into 
something that can be expressed by means of simple signals of light, sound or elec- 
tricity. Each dot represents a single time unit of approximately 1/25th of a second; 
a dash is three units long (equivalent to three dots). The spaces between the letters 
are also three units long, and five units are used as the spaces between words. 

At first, Morse was denied a patent on his code in the United States and in 
Europe. Finally, in 1843, he obtained government financing for the construction of 
a telegraph line between Washington DC and Baltimore. In 1844, the first coded 
transmission was performed, and shortly after a company was formed with the 
express purpose of covering the whole of North America with telegraph lines. By 
1860, when Napoleon III awarded Morse the Legion of Honour, the United States 
and Europe were already criss-crossed by his telegraph wires. At Morse’s death in 
1872, America had more than 300,000 kilometres of cable. 

At first, a simple device, invented in 1844 by Morse himself, was used to send


and receive telegraph messages. The device consisted of a telegraph key that served 
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NON-VERBAL COMMUNICATION 




Because he had hearing problems, Thomas Alva Edison (1847-1931) communicated with his 
wife, Mary Stilwell, by means of Morse code. During their courtship, Edison proposed marriage 
by tapping lightly with his hand, and she replied in the same way. The telegraphic code then be- 
came a common means of communication for the couple, to the point that when they went to 
see a play at the theatre, Edison placed Mary’s hand on his knee so that she could “telegraph” 
him the dialogue of the actors.


to connect and disconnect the electric current, and an electromagnet that received 
the incoming signals. Every time the key was pressed down — generally with the 
index or middle fingers — an electrical contact was established. Intermittent impulses 
produced by tapping the telegraphic key were transmitted to a cable composed of 
two copper wires. These wires, supported by tall wooden “telegraph” poles, connected
the nation’s different telegraph stations and often extended hundreds of
kilometres without interruption.


First telegraph machine designed by Samuel Morse in 1844. 
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SYMPHONY IN V MAJOR 


Beethoven is another famous deaf person associated with the telegraph, although in his case,
only rather indirectly: the first four notes of the brilliant composer's Fifth Symphony have a
rhythm reminiscent of a message in Morse code: “dot dot dot dash.”

In Morse code, dot dot dot dash corresponds to the letter V, the first letter of the word victory.
Because of this, the BBC used Beethoven’s Fifth as the opening theme for its broadcasts to
occupied Europe during World War II.


The receiver contained an electromagnet, formed from a coil of copper wire 
wrapped around an iron core. When the coil received the impulses of the electric 
current that corresponded to the dots and dashes, the iron core became magnetised 
and attracted a moving part, also made of iron. That produced a distinctive sound 
when striking the magnet. This sound was a short “click” when a dot was received, 
and a longer note when a dash was received. Initially, sending a telegram with such 
a device required a human operator to tap out the codified version of the message 
at one end, and someone else to receive and decipher it at the other. 

The translation of the conventional characters of Morse code was done according
to the following table:
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So the message “I love you” would be coded as: 


As mentioned before, Morse code was, in a way, the first version of future 
digital communication systems. To demonstrate this idea, we could happily convert
Morse into numbers, assigning a 1 to the dot and a 0 to the dash. Such strings of 1 
and 0 will become more familiar in later chapters. 
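The conversion into dots and dashes, and from there into strings of 1 and 0, can be sketched in Python. The table below is the modern international version of Morse code for the letters A to Z, which is an assumption, since Morse's original American code differed in several letters. The single space between letters and the slash between words are conventions chosen for this illustration.

```python
# International Morse code for the letters A to Z.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}

def to_morse(message):
    """Letters are separated by a space and words by ' / '."""
    words = message.upper().split()
    return " / ".join(" ".join(MORSE[ch] for ch in word) for word in words)

def to_bits(morse):
    """Assign a 1 to the dot and a 0 to the dash, as suggested in the text."""
    return morse.replace(".", "1").replace("-", "0")

print(to_morse("I love you"))  # .. / .-.. --- ...- . / -.-- --- ..-
print(to_bits(MORSE["V"]))     # 1110
```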

In the 20th century, traditional telegraphy was replaced by wireless communica- 
tion driven by the invention of the radio. The telegraphists of yesteryear became 
radio operators. This new technology meant messages could be sent at even higher 
speeds and in bulk. However, messages sent as electromagnetic waves were relatively 
easy to intercept. This provided cryptanalysts with large quantities of ciphered mate- 
rial to work on and helped to consolidate their dominant position in the battle with 
cryptographers, given that the majority of ciphers used by governments and private 
agencies, even the most sensitive, were based on known algorithms. This was the
case of the Playfair cipher, for example, which was invented by the Britons Baron
Lyon Playfair and Sir Charles Wheatstone. The Playfair cipher was an ingenious
variation on Polybius’ cipher, but in the end only a variation; the cipher is set out
in detail in the Appendix.

Despite the considerable inventiveness of their creators, the decryption of these 
recycled ciphers was ultimately a question of time and computing capacity. The 


cryptographic history of World War I illustrates this perfectly. We have already heard 


SAVE OUR SOULS, SHIP OR ANYTHING ELSE BEGINNING WITH S


The most famous signal in Morse code is SOS. It was established as a distress call by a group of
European countries because of the simplicity of its transmission (three dots, three dashes, three
dots); no meaning was attached to it. However, people were soon giving the signal alternative
meanings. The most famous of these “backronyms” was Save Our Souls. Later, as the signal was
frequently used at sea, SOS also became referred to popularly as Save Our Ship.




about the weakness of the German diplomatic cipher during the Zimmermann 
telegram incident. What the Germans themselves didn’t suspect was that another 
of their common ciphers, known as ADFGVX and used to encrypt the most sensi- 
tive messages destined for the front, could also be solved by enemy cryptanalysts 
despite its supposed invulnerability. This double failure of Germany’s World War I 
codes made all sides aware of the need to cipher more securely. This objective was 


to be achieved by making cryptanalysis more difficult. 


80 kilometres from Paris 


In June, 1918, German troops were preparing to attack the French capital. It was 
essential to the Allies to intercept enemy communications to find out where the
offensive incursions would take place. The German messages destined for the front 
were encrypted with the ADFGVX cipher, considered by the German military to 
be unbreakable. 

Our interest in this cipher stems from the fact that it combines substitution and
transposition algorithms. It is one of the most sophisticated methods of classical
cryptography. Introduced by the Germans in March 1918, it had no sooner come to the
attention of the French than they frantically applied themselves to breaking the code.
Luckily for them, a talented cryptanalyst called Georges Painvin was working in the
central cipher bureau. He devoted himself to the task day and night. On the night of
June 2, 1918, Painvin succeeded in deciphering a first message. The ominous content
was an order directed to the front: “Rush munitions. Even by day if not seen.” The
introduction to the cipher indicated that it had been sent from some place located
between Montdidier and Compiègne, some 80 kilometres north of Paris. Painvin’s
achievement allowed the French to foil the attack and halt the German advance.

As mentioned already, the ADFGVX cipher consists of two parts: a substitution
and a transposition. In the first phase, substitution, we have a seven-by-seven
grid in which the first row and the first column each contain the letters ADFGVX
(see page 58). The remaining squares of the grid are randomly filled in with 36
characters: the 26 letters of the alphabet and the numbers 0 to 9. The arrangement
of the characters constitutes the key to the cipher, and the recipient, clearly, needs
this information to understand the content of the message.
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Let’s use the following base table: 


The cipher consists of translating each character of the message into coordinates
using the letters from the group ADFGVX. The first coordinate is the letter that
corresponds to the row, and the second is the one that corresponds to the column. For
example, if we wished to cipher the number 4, we would write “DV.” The message
“Target is Paris” would be ciphered as follows:

Message  T  A  R  G  E  T  I  S  P  A  R  I  S
Cipher   GX FA FV DA GV GX VX VF AD FA FV VX VF


Up to this point we are dealing with a simple substitution, and frequency analysis 


would be sufficient to decipher the message. 

The cipher, however, contains a second phase — transposition. The transposition 
depends on a keyword agreed upon by the sender and the receiver. This phase of the 
cipher is carried out as follows. First, we construct a grid with as many columns as 
there are letters in the keyword, and we fill in the cells with the ciphered text. The 
letters of the keyword are written in the top row of the new grid. In this example, 
the keyword will be BETA. We create a new table in which the first row consists 
of the keyword and the following rows contain the letters obtained by encoding the 
message through substitution. Any empty cells are filled in with the number zero
which, as we see from the first table, is symbolised by AG.

So to apply this second process to our message “Target is Paris”, we first recall
that the substitution cipher produced was:


We continue with the transposition cipher and change the position of the col- 
umns, so the letters of the key are arranged in alphabetical order. This gives us the 
following table. 
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The ciphered message is produced by taking the letters of the grid by columns. 


In the example, we get: 
AAXFAXGGFGVAFVXVVXDVFFDGVFVA 
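Both phases of the cipher can be sketched in Python. The base table is not reproduced in this copy, so the sketch relies on the ten cells that can be read off the worked example (4 is DV, zero is AG, and the pairs implied by the final ciphered message) and fills the remaining cells in an arbitrary fixed order. That filling, like the function names, is an assumption made purely for the illustration.

```python
import string

SYMBOLS = "ADFGVX"

# Cells recoverable from the worked example: character -> row letter + column letter.
KNOWN = {"T": "GX", "A": "FA", "R": "FV", "G": "DA", "E": "GV",
         "I": "VX", "S": "VF", "P": "AD", "0": "AG", "4": "DV"}

def build_table(known):
    """Place the remaining 26 characters in the free cells, in reading order.
    The real table was filled at random; this order is arbitrary."""
    cells = [row + col for row in SYMBOLS for col in SYMBOLS]
    free_cells = [cell for cell in cells if cell not in known.values()]
    free_chars = [ch for ch in string.ascii_uppercase + string.digits
                  if ch not in known]
    table = dict(known)
    table.update(zip(free_chars, free_cells))
    return table

def substitute(message, table):
    """Phase one: replace every letter or digit with its pair of coordinates."""
    return "".join(table[ch] for ch in message.upper() if ch in table)

def transpose(text, keyword, pad="AG"):
    """Phase two: write the text in rows under the keyword, pad the last row
    with zeros (AG), sort the columns alphabetically and read them off."""
    width = len(keyword)
    while len(text) % width:
        text += pad
    order = sorted(range(width), key=lambda i: keyword[i])
    return "".join(text[i::width] for i in order)

table = build_table(KNOWN)
step_one = substitute("Target is Paris", table)
print(step_one)                     # GXFAFVDAGVGXVXVFADFAFVVXVF
print(transpose(step_one, "BETA"))  # AAXFAXGGFGVAFVXVVXDVFFDGVFVA
```

Only the ten known cells are used by this message, so the arbitrary filling of the other cells does not affect the result.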


As we can see, the message consists of an apparently random mix of the letters A,
D, F, G, V and X. The Germans selected these six letters because they sounded very
different to each other when sent in Morse code. This helped the receiver to detect
hypothetical transmission errors more easily. Moreover, since it consisted of only six
letters, the telegraphic transmission was simple and therefore easy for inexperienced
operators to send.

If we turn to the Morse code table at the beginning of the chapter, we can see
that the codes for each of the letters of the cipher ADFGVX are as follows:

A  .-
D  -..
F  ..-.
G  --.
V  ...-
X  -..-

The receiver only needs the random distribution of the letters and numbers
shown by the base table and the transposition keyword to reverse the encryption and
reveal the message.


The Enigma machine 


In 1919, the German engineer Arthur Scherbius patented a machine that was 
designed to produce completely secure communications. Its name, Enigma, has 
since become synonymous with military secrecy. For all its apparent sophistication, 


Enigma is, in essence, an improved version of Alberti’s disk, as we shall see below. 
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Because it was relatively easy to use and because of the complexity of the result- 
ing cipher, Enigma was the system selected by the German government to encrypt 
a large part of its military communications during World War II. 

As a result, deciphering the Enigma code became an absolute priority for the 
governments confronting Nazi Germany. When they finally succeeded, the messages 
intercepted and deciphered by Allied intelligence proved to be decisive in bring- 
ing an end to the conflict. The history of the deciphering of the Enigma code is a 
fascinating story that involved, in the main, the departments of intelligence of both 
Poland and the United Kingdom, and includes among its heroes the mathematical 
genius Alan Turing, the man considered to be the father of modern computing. 
The battle to break the Enigma code also yielded the first digital computer in his- 
tory, and can be considered the most spectacular episode in the long and colourful 


history of military cryptanalysis. 


Above left: German soldiers transcribe a ciphered message with an Enigma machine during
World War II. Above right: a replica four-rotored Enigma machine.
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The Enigma machine itself was an electromagnetic device similar in appearance 
to a typewriter. What made it so special was that its mechanical components changed 
position with each key press so that even if the same plaintext letter was pressed 
consecutively, it would most likely be encoded differently each time. 

The physical process of ciphering was relatively simple. First, the sender arranged 
the machine’s various plugs and rotors according to a starting point specified by the 
particular code book in force at the time (code books were changed regularly). Then 
he would type the first letter of the plaintext, and the machine would automatically 
generate an alternative letter that would appear on an illuminated panel — the first 


letter of the ciphered message. 


TRENCH CODES 


In battle, using complex ciphers like ADFGVX is very hard work. In the Spanish Civil War
(1936-1939), for example, there were many simpler substitution algorithms, such as the following:

[Substitution table not legible in this copy.]

As we can see, several letters have more than one version. The R, for example, can be
substituted by 28 or by 54. The word “GUERRA” (WAR) would be ciphered as 167427285453.

These codes, which were primarily substitution codes, were called trench codes and were
intended for very specific uses.


The Clave Violeta (Violet Key, left) was used by the 415th battalion of the
104th Republican Brigade, and was captured by the Nationalist side. The
note translates as: “The ciphers will necessarily have 2 [...] letters. The columns
[rows] marked with a (1) correspond to the alphabet. The columns marked with a (2)
correspond to their equivalent in code.”
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The first rotor switch made a rotation that placed it in one of the 26 possible 
positions. The switch’s new position brought a new cipher of the letters, and the 
signals operator then entered the second letter, and so on. To decode the message, 
it was sufficient to enter the ciphered characters into another Enigma machine as 
long as the starting parameters of the second machine were the same as those of 
the machine that had carried out the encryption. 

Using the illustration on the following page, we can present a very simplified
schematic of the Enigma’s encryption mechanism, using rotors with an alphabet of
only three letters. As a result each rotor has only three possible positions instead of
the 26 in the real thing.


For a higher level of secrecy, the Nationalist side, headed by General Franco, deployed a new
weapon: 30 of the so-called Enigma machines supplied by their Nazi allies. This would be the
first intensive military use of the ciphering device that Germany would come to use in World War
II. There were attempts to break the code during the Spanish conflict, but without success.


Telegram (left) of October 27, 1936, to the chief of the Granada Sector (Republican):
“Your telegram ciphered yesterday [...] indecipherable.”

An encoded Republican message (right) intercepted by the Spanish Falangist Fascist
movement in the Canary Islands.

As we can see, with an Enigma machine’s rotor in the initial position, each letter
of the original message is substituted by a different one except for A, which remains 
unchanged. After ciphering the first letter, the rotor does a one-third turn. In this 
new position, the letters are now substituted by different ones from those of the first 
cipher. The process concludes with the third letter, after which the rotor returns to 
its initial position and the sequence of the cipher will repeat itself. 
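
The behaviour just described can be sketched in a few lines of Python. This is only an illustrative model: the initial substitution (A stays, B and C change places) is the one the text implies for a three-letter alphabet, but the way a one-third turn shifts the contacts, and the names `WIRING` and `encipher`, are assumptions made for this sketch.

```python
ALPHABET = "ABC"
# Initial substitution implied by the text: A is unchanged, B and C are swapped.
WIRING = "ACB"  # A->A, B->C, C->B


def encipher(text, start=0):
    """Encipher text with one toy rotor that makes a one-third turn per letter."""
    n = len(ALPHABET)
    pos = start
    out = []
    for ch in text:
        i = ALPHABET.index(ch)
        # Assumed rotation model: shift the entry contact by pos,
        # pass through the fixed wiring, then shift the exit contact back.
        j = (ALPHABET.index(WIRING[(i + pos) % n]) - pos) % n
        out.append(ALPHABET[j])
        pos = (pos + 1) % n  # the rotor steps after every letter
    return "".join(out)
```

In this toy model, enciphering a run of identical letters shows the substitution changing with each turn and repeating after the third letter. Because every rotor position here happens to be a self-inverse substitution, feeding the ciphertext back through `encipher` with the same starting position restores the original text.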

The rotary switches of a standard Enigma machine had 26 positions, one for 
each letter of the alphabet. Consequently, a single rotor could perform 26 different 
ciphers. Therefore, the initial position of the rotor is the key. To increase the number 
of possible keys, the design of the Enigma incorporated up to three rotors, connected
mechanically one to the other. So, when the first rotor completed a turn, the next
one initiated another one, and so on until the complete rotations of all the rotors
ended, for a total of 26 x 26 x 26 = 17,576 possible ciphers. In addition, Scherbius’s
design allowed for exchanging the order of the switches, thus increasing the number
of codes even more, as we shall see below.
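
The count of 17,576 can be checked by simulating the stepping. The sketch below follows the simplified odometer-style description in the text (each rotor that completes a turn advances the next one); it is not meant to reproduce the exact mechanics of any particular machine, and the function names are made up for the illustration.

```python
def step(positions, size=26):
    """Advance the rotors like an odometer: the first rotor always moves,
    and a rotor that returns to position 0 pushes the next rotor on one place."""
    pos = list(positions)
    for i in range(len(pos)):
        pos[i] = (pos[i] + 1) % size
        if pos[i] != 0:  # no full turn completed, so the later rotors stay put
            break
    return tuple(pos)


def cycle_length(n_rotors=3, size=26):
    """Count the key presses needed before all rotors return to the start."""
    start = (0,) * n_rotors
    state = step(start, size)
    count = 1
    while state != start:
        state = step(state, size)
        count += 1
    return count
```

With the default arguments, `cycle_length()` walks through every combination of three 26-position rotors before repeating, which is where the product 26 x 26 x 26 comes from.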

A three-rotored Enigma machine with its casing partly removed to show its plugboard
(at the front).

Besides the three rotors, Enigma also had a plugboard located between the first rotor
and the keyboard. The plugboard allowed for the interchange of pairs of letters
before they were connected to the switch, and in this way added a considerable
number of codes to the cipher. The standard design of the Enigma machine had
six cables that could interchange up to six pairs of letters. The following illustration
shows the operation of the interchanging plugboard, again in a simplified form of
only three letters and three cables.
In this way, the A swaps with the C, the B with the A, and the C with the B.
With the addition of a plugboard, a simplified three-letter Enigma machine would
function as follows:

[Illustration: plugboard and rotors]


How many more codes did the seemingly trivial addition of the plugboard provide?
We have to consider the number of ways of connecting the six pairs of letters
selected from a group of 26. The possible number of transformations of n pairs of
letters of an alphabet of N characters is determined by the following formula:

$$\frac{N!}{(N-2n)!\cdot n!\cdot 2^{n}}$$

In our example, N = 26 and n = 6, and that gives us a mere 100,391,791,500
combinations.
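
The formula is easy to evaluate with exact integer arithmetic. The following is a minimal sketch; the function name `plugboard_settings` and its parameter names are chosen here for illustration only.

```python
from math import factorial


def plugboard_settings(n_pairs, alphabet_size=26):
    """Number of ways to connect n_pairs cables on an alphabet of
    alphabet_size letters: N! / ((N - 2n)! * n! * 2**n)."""
    return factorial(alphabet_size) // (
        factorial(alphabet_size - 2 * n_pairs)
        * factorial(n_pairs)
        * 2 ** n_pairs
    )


print(plugboard_settings(6))  # six cables on a 26-letter alphabet
```

The division by n! accounts for the fact that the order in which the cables are plugged in does not matter, and the division by 2^n for the fact that each cable can be plugged in either way round.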
Consequently, the total number of ciphers offered by the Enigma machine with
three 26-letter rotors and a plugboard with six cables is the following:

1. With reference to the rotations of the rotary switches, $26^3 = 26 \cdot 26 \cdot 26$ = 17,576
combinations.

2. Likewise, the three rotors (1, 2, 3) could interchange with each other and 
could occupy the positions 1-2-3, 1-3-2, 2-1-3, 2-3-1, 3-1-2, 3-2-1; this gives 
us six possible additional combinations. 

3. Finally, we have calculated that the arrangement of the six cables of the initial 
plugboard added 100,391,791,500 additional ciphers. 
The total number of ciphers is obtained from the product of the different
specified combinations, 6 · 17,576 · 100,391,791,500 = 10,586,916,764,424,000.
Therefore, Enigma machines could cipher a text using more than ten-thousand
trillion different combinations. The German Reich was secure in the belief
that their highest level communications were utterly safe. This was a big mistake.


Deciphering the Enigma code 


Any Enigma key first specified the configuration of the plugboard for each of the
six possible letter interchanges: for example, B/Z, F/Y, R/C, T/H, E/O and L/J,
which indicated that the first cable interchanged the letters B and Z, and so on.
Secondly, the key showed the order of the rotors (such as 2-3-1), and lastly, the key
included the starting orientation of the rotors (such as R, V, B, indicating which
letter was located at the starting point, or index mark). These settings were collected
in code books that were themselves transmitted in an encrypted form and could 
change from one day to the next or when other circumstances dictated. For exam- 
ple, certain keys were reserved for certain types of message. 

To avoid repeating the same code throughout the day — during which thou- 
sands of messages could be sent — Enigma’s operators had some ingenious tricks for 
transmitting new codes, of restricted use, without having to alter the entire book 
of shared codes. So, the despatcher sent a six-letter message, codified according to 
the applicable daily code, that was actually a new set of index marks for the rotors, 
for example T-Y-J. (For greater security, the sender codified these three instructions 
twice, hence the six letters). Next, he would code the real message according to this 
new arrangement. The recipient received a message that he could not decipher with 
the code of the day, but he knew that the first six letters were actually instructions 
to arrange the rotors in another position. The receiver would do this, keeping the 
plugboard and the order of the rotors unchanged, and could then correctly decrypt 
the message. 

The Allies obtained the first valuable information relating to Enigma in 1931 
from a German spy, Hans-Thilo Schmidt. This consisted of various manuals for 
the practical use of the machine. The contact with Schmidt was made by French 
intelligence services who subsequently shared information with their Polish
counterparts. The Polish department of cryptanalysis, the Biuro Szyfrów (cipher bureau),
went to work on Schmidt’s documents and it got hold of various Enigma machines
stolen from the Germans.
In an unusual move for the time, the Polish code-breaking team included a large
number of mathematicians. Among them was a talented, introspective and shy young
man of 23 by the name of Marian Rejewski. He immediately concentrated his efforts
on the six-letter codes that preceded many of the daily messages exchanged
by the Germans. Rejewski theorised that the second three letters of the code were
a new cipher of the first three and knew, therefore, that the fourth, fifth and sixth
letters could give a clue to the rotation of the switches.

From this discovery, as small as it might appear, Rejewski built an extraordi- 
nary network of deductions that would lead to the breaking of the Enigma code. 
The details of this process are very complex, and we will not expound them here, 
but the fact is that, after a few months, Rejewski had reduced the number of
possible codes that needed to be deciphered from ten-thousand trillion to just
105,456 that resulted from different combinations of the order of the switches
and their different rotations. To do this, Rejewski built a device, known as the
Bombe, that functioned in the same way as the Enigma and that could simulate
any of the possible positions of the three rotors in search of the daily code. As early
as 1934, the Biuro Szyfrów had broken Enigma and could read any message
within 24 hours.

Although the Germans did not know that the Poles had penetrated Enigma’s 
security, they still added improvements to a system that, after all, had already been 
operating for more than a decade. In 1938, the Enigma operators received two more 
rotors to add to the three standard positions and, shortly thereafter, new models of 
the machine were distributed with ten-cable plugboards.

Suddenly, the number of possible codes increased to about 159 quintillion. The 
addition, alone, of two more rotors to the rotation of the switches increased the 
possible combination of arrangements from six to 60. That is, any one of the five 
rotors in the first position (five options) multiplied by any one of the four remain- 
ing rotors in the second position (four options) multiplied by any one of the three 
rotors in the third position (three options) = 5 x 4 x 3 = 60. Although they knew
how to decipher the code, the Biuro Szyfrów lacked the means necessary to analyse
10 times as many new rotor configurations all at once.
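
The jump in the number of keys can be reproduced by combining the three factors discussed so far: rotor order, rotor positions and plugboard wiring. The sketch below is an illustration under the counting model used in this chapter; `key_space` and its parameters are names invented for the example.

```python
from math import factorial, perm


def plugboard_settings(n_pairs, alphabet_size=26):
    """N! / ((N - 2n)! * n! * 2**n) ways of wiring n_pairs cables."""
    return factorial(alphabet_size) // (
        factorial(alphabet_size - 2 * n_pairs)
        * factorial(n_pairs)
        * 2 ** n_pairs
    )


def key_space(rotors_available, rotors_used, cables, letters=26):
    """Total keys = rotor orders x rotor start positions x plugboard wirings."""
    orders = perm(rotors_available, rotors_used)  # e.g. 5 x 4 x 3 = 60
    positions = letters ** rotors_used            # e.g. 26**3 = 17,576
    return orders * positions * plugboard_settings(cables, letters)


original = key_space(3, 3, 6)    # three rotors, six cables
upgraded = key_space(5, 3, 10)   # five rotors to choose from, ten cables
print(original, upgraded)
```

Note that `math.perm` requires Python 3.8 or later. The ratio `perm(5, 3) // perm(3, 3)` gives the factor of 10 in rotor configurations mentioned in the text, and `upgraded` is the figure of roughly 159 quintillion.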
Some versions of the Enigma machine.


The British take over 


The upgrade to the Enigma system was not accidental: Germany had already be- 
gun its aggressive expansion through Europe with the annexation of Czecho- 
slovakia and Austria, and was planning the invasion of Poland. In 1939, with the 
conflict now unleashed in the heart of Europe, and their country conquered, the 
Poles transferred all their Enigma machines and understanding to their British 
allies who, in August of that year, decided to bring together their previously dis- 
persed cryptanalytic units. The location selected was a mansion situated on the 
outskirts of London, in an estate called Bletchley Park. A brilliant new cryptanalyst 
was added to the team at Bletchley Park, a young Cambridge mathematician called 
Alan Turing. Turing was a world authority in the sphere of computing, then still 
an embryonic field, and open to new and revolutionary developments. Deciphering
the improved Enigma machines proved to be the impetus behind several leaps
forward in computing.

Experts at work at Bletchley Park where the Enigma code was deciphered.


The experts at Bletchley Park concentrated on short fragments of ciphered text 
that they suspected corresponded to segments of plaintext. For example, thanks to 
their spies on the ground, it was known that the Germans had the habit of transmit- 
ting a codified message about the meteorological conditions at various locations 
along the front line around 6 p.m. every day. Therefore, they were reasonably certain 
that a message intercepted shortly after that hour contained a ciphered version of 
plaintexts such as “weather” and “rain.” Turing invented an electrical system that
allowed for the reproduction of each and every one of the 1,054,560 possible
combinations of the order and position of the three rotors in less than five hours. This
system was fed with ciphered words that, by the length of their characters and other 
clues, were suspected to correspond to fragments of plaintext such as the above- 
mentioned weather and rain. 

Let us suppose that they suspected that the text ciphered FGRTY was an en- 
crypted version of “bread”. The cipher would be entered into the machine and if 
there was a combination of rotors that gave the word “bread” as a result, the cryp- 
tanalysts knew that they had found the codes that corresponded to the configuration 
of the rotary switches. Next, the operator entered the ciphered text in a real Enigma 
machine with the rotors arranged according to the code. If the machine showed a
deciphered text DREAB, for example, it was clear that the part of the code relating
to the position of the plugboard cables included the transposition of the letters D
and B. In this way, they obtained the entire code. Enigma’s secrets were definitely
becoming known. In the process of developing and refining the above-mentioned
analytic mechanisms, the team at Bletchley Park built the first digital and
programmable computer in history, christened Colossus.


Colossus, the forerunner of the modern computer, at Bletchley Park. The photograph, taken in 
1943, shows the control panel of the complex device. 


Other ciphers of World War II


Japan developed two of its own encoding systems, known as Purple and JN-25.
The first one was used for diplomatic communications and the second to send
military messages. Both ciphers were carried out by mechanical devices. JN-25, for
example, consisted of a substitution algorithm that translated the written characters
of the Japanese language (up to a limit of 30,000 characters) into series of numbers
as specified by random tables of five number groups. Despite the precautions
taken by the Japanese, the British and Americans cracked the Purple and the JN-25
codes. The intelligence obtained thanks to the interception of the Purple and
JN-25 ciphers was codenamed Magic, and had considerable impact during pivotal
encounters in the Pacific war, particularly the Battles of the Coral Sea and Midway,
both in 1942. Magic’s intelligence was also used to plan strategic missions, such
as the interception and shooting down of Japanese military commander Admiral
Yamamoto’s plane the following year.


A TRULY BRILLIANT MIND 


Alan Turing (left) was born in England in 1912. Even when young, he showed
a great aptitude for mathematics and physics. In 1931, he went to Cambridge
University where he became interested in the work of the logician Kurt Gödel
into the general problem of inherent incompleteness of any logical system.
Three years before he had published a study on the theoretical possibility of
building machines that were capable of computing different algorithms such as
addition, multiplication, etc. Inspired by Gödel’s works, in 1937 Turing took his
ideas on the limits of proof and computation a step forward and established
the principles of a “universal machine” capable of performing any conceivable algorithmic
computation. Thus was born one of the pillars of modern information theory. Two years
before, Turing had made contact with the great Hungarian mathematician Janos von Neumann,
who was, by that time, living in the United States and better known as John. Von Neumann,
considered the “other father” of computing, offered Turing a job at Princeton, a well paid
and highly prestigious position. However, Turing preferred the bohemian atmosphere at
Cambridge and declined the offer. In 1939, as war broke out, he joined the British cryptanalysis
team at Bletchley Park. His work during the war earned him an OBE (Order of the British
Empire), but Turing was homosexual, which was illegal at the time, and a conviction in 1952
made it impossible for him to work on secret government projects. Profoundly depressed by the
rejection, Alan Turing committed suicide on June 7, 1954, by swallowing potassium cyanide.
The Navajo code talkers


While the United States made good use of information intercepted from the en- 
emy in the Pacific theatre of operations, the US military’s own communications 
used several codes — in the strict sense of the word as discussed at the beginning 
of the book. The encryption algorithms operated directly on the nature of the 
words. These codes — the Choctaw, the Comanche, the Meskwaki, and, above all, 
the Navajo — were not explicitly set out in complicated manuals, nor were they the 
result of planning by a judicious department of cryptographers: they were simply 
authentic Native American languages. 

The United States army placed radio operators from these native groups in 
various units along the front, and charged them with transmitting messages in their 
respective languages, which were unknown not only to the Japanese, but also to the 
rest of the American forces. A set of basic codes was superimposed on these ciphered
messages to prevent a captured soldier from being forced to translate them. These
“code talkers” served in American units until the Korean War.


Two Navajo “code talkers” during the Battle of Bougainville in 1943. 
Innovations: Hill's cipher


The ciphers discussed up to this point, in which one character is substituted by 
another in some pre-established manner, are always vulnerable to being cracked by 
cryptanalysis, as we have seen. 

In 1929, the US mathematician Lester S. Hill invented, patented and put up for 
sale — unsuccessfully — a new ciphering system that made use of a combination of 
modular arithmetic and linear algebra. 

As we shall see below, a matrix can be a very useful tool to cipher a message, by 
composing the text into pairs of letters and associating each letter with a numeri- 
cal value. 


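To cipher a message, we use a matrix:

$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$

with the restriction that its determinant be 1, that is, that ad − bc = 1. To decipher it,
we use the inverse matrix:

$$A^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

A matrix of 2 x 2 is of the form $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and a matrix of 2 x 1 is of the form $\begin{pmatrix} x \\ y \end{pmatrix}$.
The product of both these matrices gives us a new matrix 2 x 1, called a column vector:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}$$

In the case of the matrix 2 x 2 the value ad − bc is called the determinant of the matrix.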
The restriction in the value of the determinant is set so that the inverse matrix
will function as a deciphering tool. As a rule, for an alphabet of n characters, it is
necessary that gcd(det A, n) = 1. If the opposite were true, the
existence of the inverse in modular arithmetic could not be guaranteed.
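
The gcd condition can be made concrete with a short sketch that computes the inverse of a 2 x 2 matrix in modular arithmetic, or reports that none exists. The function name `inverse_matrix_mod` is an assumption for this illustration, and the three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
from math import gcd


def inverse_matrix_mod(matrix, n):
    """Return the inverse of a 2 x 2 integer matrix modulo n.

    Raises ValueError when gcd(det, n) != 1, the case in which
    the inverse is not guaranteed to exist."""
    (a, b), (c, d) = matrix
    det = (a * d - b * c) % n
    if gcd(det, n) != 1:
        raise ValueError("determinant shares a factor with n: no inverse")
    det_inv = pow(det, -1, n)  # modular inverse of the determinant
    # Adjugate matrix times det^-1, reduced modulo n.
    return [
        [(d * det_inv) % n, (-b * det_inv) % n],
        [(-c * det_inv) % n, (a * det_inv) % n],
    ]
```

When the determinant is exactly 1, as the chapter requires, `det_inv` is also 1 and the result is simply the adjugate matrix reduced modulo n.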

Continuing the example, we take an alphabet of 26 letters with a “blank space”
character, which for purposes of this example we will designate as @. We assign
each letter a numerical value as shown in the following table:

 A |  B |  C |  D |  E |  F |  G |  H |  I |  J |  K |  L |  M |  N
 0 |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |  9 | 10 | 11 | 12 | 13

 O |  P |  Q |  R |  S |  T |  U |  V |  W |  X |  Y |  Z |  @
14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26


To obtain values between 0 and 26, we will work in modulus 27. 


The process of ciphering and deciphering the text is as follows: First we determine
a ciphering matrix A with determinant 1.

For example, $A = \begin{pmatrix} 1 & 3 \\ 2 & 7 \end{pmatrix}$.

The deciphering matrix will be the inverse matrix $A^{-1} = \begin{pmatrix} 7 & -3 \\ -2 & 1 \end{pmatrix}$.

Therefore, A will be the key of the cipher, and $A^{-1}$ is the decipher key.

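Below, for example, we establish the message “BOY.” The letters of the message
are grouped in pairs: BO Y@. Their numerical equivalents according to the table
are the pairs of numbers (1, 14) and (24, 26). Next we multiply matrix A by each
pair of numbers:

$$\text{Ciphered “BO”} = A\begin{pmatrix} 1 \\ 14 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 2 & 7 \end{pmatrix}\begin{pmatrix} 1 \\ 14 \end{pmatrix} = \begin{pmatrix} 43 \\ 100 \end{pmatrix} \equiv \begin{pmatrix} 16 \\ 19 \end{pmatrix} \pmod{27}$$

that, according to the table, corresponds to the letters (Q, T).

$$\text{Ciphered “Y@”} = A\begin{pmatrix} 24 \\ 26 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 2 & 7 \end{pmatrix}\begin{pmatrix} 24 \\ 26 \end{pmatrix} = \begin{pmatrix} 102 \\ 230 \end{pmatrix} \equiv \begin{pmatrix} 21 \\ 14 \end{pmatrix} \pmod{27}$$

that corresponds to the letters (V, O).

The message “BOY” is ciphered “QTVO”.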
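For the deciphering, the inverse operation is performed using the matrix:

$$A^{-1} = \begin{pmatrix} 7 & -3 \\ -2 & 1 \end{pmatrix}$$

We take the pair of letters (Q, T) and seek their numerical equivalents from the
table: (16, 19). We then multiply them by $A^{-1}$, and get:

$$\begin{pmatrix} 7 & -3 \\ -2 & 1 \end{pmatrix}\begin{pmatrix} 16 \\ 19 \end{pmatrix} = \begin{pmatrix} 55 \\ -13 \end{pmatrix} \equiv \begin{pmatrix} 1 \\ 14 \end{pmatrix} \pmod{27}$$, equivalent to (B, O).

We do the same with the second pair (V, O) and their numerical values (21, 14)
and we get:

$$\begin{pmatrix} 7 & -3 \\ -2 & 1 \end{pmatrix}\begin{pmatrix} 21 \\ 14 \end{pmatrix} = \begin{pmatrix} 105 \\ -28 \end{pmatrix} \equiv \begin{pmatrix} 24 \\ 26 \end{pmatrix} \pmod{27}$$, equivalent to (Y, @).

We have then proven that the deciphering key works.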
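
The whole worked example can be replayed in code. The sketch below assumes the 27-character alphabet of the table (with @ as the blank) and the key matrix with rows (1, 3) and (2, 7), which is the matrix consistent with the numbers in the example; the helper name `apply_matrix` is invented for the illustration.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ@"  # '@' stands for the blank space
N = len(ALPHABET)  # 27, so all arithmetic is modulo 27

KEY = [[1, 3], [2, 7]]        # ciphering matrix A, determinant 1
KEY_INV = [[7, -3], [-2, 1]]  # deciphering matrix, the inverse of A


def apply_matrix(matrix, text):
    """Multiply each pair of letters (as a column vector) by matrix, mod 27."""
    if len(text) % 2:
        text += "@"  # pad an odd-length message with a blank
    (a, b), (c, d) = matrix
    out = []
    for k in range(0, len(text), 2):
        x = ALPHABET.index(text[k])
        y = ALPHABET.index(text[k + 1])
        out.append(ALPHABET[(a * x + b * y) % N])
        out.append(ALPHABET[(c * x + d * y) % N])
    return "".join(out)


ciphered = apply_matrix(KEY, "BOY")
recovered = apply_matrix(KEY_INV, ciphered)
print(ciphered, recovered)
```

Python's `%` operator always returns a non-negative result for a positive modulus, so the negative entries of the inverse matrix need no special handling.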


For this example we have considered pairs of two characters. We would have 
greater security if we grouped the letters in threes or even fours. In these cases, the 
calculations would be made with matrices 3 x 3 and 4 x 4, respectively, which would 
be extremely laborious if carried out manually. With today’s computers, however, it 
is possible to work with huge matrices, and with their respective inverses. 

Hill’s cipher suffers from an important weakness: if an eavesdropper obtains a small
fragment of the plaintext, it is possible to decipher the entire message. The search
for the perfect cipher was far from over.
0 and 1


The invention of the Colossus computer and the breaking of the Enigma code 
opened the door to the greatest communication revolution known to humanity. This
gigantic step forward was based to a large extent on the development of an encryp- 
tion system that enabled secure, efficient and rapid communications across a vast 
network driven by two fundamental agents: computers and their users — you and 
me. When we use the word security today, we are not just referring to cryptography 
and secrecy. The word also has a much broader sense that also encompasses notions 
of reliability and efficiency. 

The binary system forms the basis of the technological revolution. This super-simple
code formed by two characters, 0 and 1, is used in computing for its ability
to represent the interaction of the electronic circuits in a computer (i.e. a circuit is
on, represented by 1, or off, represented by 0). Each 0 and each 1 is termed a bit (a
term derived from binary digit).


The ASCII code 


One of the binary system’s many applications is a specific family of characters, each
with a length of 8 bits (known as a byte). These characters are alphanumeric and
represent the basic symbols used in conventional communication. They are termed the
ASCII (American Standard Code for Information Interchange) codes. The number
of ways of arranging 0 and 1 in a group of eight is $2^8$ = 256.


MEMORY BYTES

The memory and storage capacity of a computer is measured in multiples of bytes:
Kilobyte (kB): 1,024 bytes
Megabyte (MB): 1,048,576 bytes
Gigabyte (GB): 1,073,741,824 bytes
Terabyte (TB): 1,099,511,627,776 bytes

ASCII codes allow users to enter text into a computer. When we type an alphanumeric
character, the computer converts it into a byte of data, a chain of eight bits.
So, for example, if we type the letter A, the computer converts it into 0100 0001.
Binary ASCII values are given to all the characters in common usage: 26 capital
letters, 26 lower-case letters, 10 numerical digits, 7 symbols of punctuation and some
special characters. All are shown in the following table. The corresponding decimal
number (in the column headed ‘Dec’) is given for each character’s binary code:


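ASCII TABLE

Char   Binary     Dec | Char   Binary     Dec | Char   Binary     Dec
space  0010 0000   32 | @      0100 0000   64 | `      0110 0000   96
!      0010 0001   33 | A      0100 0001   65 | a      0110 0001   97
"      0010 0010   34 | B      0100 0010   66 | b      0110 0010   98
#      0010 0011   35 | C      0100 0011   67 | c      0110 0011   99
$      0010 0100   36 | D      0100 0100   68 | d      0110 0100  100
%      0010 0101   37 | E      0100 0101   69 | e      0110 0101  101
&      0010 0110   38 | F      0100 0110   70 | f      0110 0110  102
'      0010 0111   39 | G      0100 0111   71 | g      0110 0111  103
(      0010 1000   40 | H      0100 1000   72 | h      0110 1000  104
)      0010 1001   41 | I      0100 1001   73 | i      0110 1001  105
*      0010 1010   42 | J      0100 1010   74 | j      0110 1010  106
+      0010 1011   43 | K      0100 1011   75 | k      0110 1011  107
,      0010 1100   44 | L      0100 1100   76 | l      0110 1100  108
-      0010 1101   45 | M      0100 1101   77 | m      0110 1101  109
.      0010 1110   46 | N      0100 1110   78 | n      0110 1110  110
/      0010 1111   47 | O      0100 1111   79 | o      0110 1111  111
0      0011 0000   48 | P      0101 0000   80 | p      0111 0000  112
1      0011 0001   49 | Q      0101 0001   81 | q      0111 0001  113
2      0011 0010   50 | R      0101 0010   82 | r      0111 0010  114
3      0011 0011   51 | S      0101 0011   83 | s      0111 0011  115
4      0011 0100   52 | T      0101 0100   84 | t      0111 0100  116
5      0011 0101   53 | U      0101 0101   85 | u      0111 0101  117
6      0011 0110   54 | V      0101 0110   86 | v      0111 0110  118
7      0011 0111   55 | W      0101 0111   87 | w      0111 0111  119
8      0011 1000   56 | X      0101 1000   88 | x      0111 1000  120
9      0011 1001   57 | Y      0101 1001   89 | y      0111 1001  121
:      0011 1010   58 | Z      0101 1010   90 | z      0111 1010  122
;      0011 1011   59 | [      0101 1011   91 | {      0111 1011  123
<      0011 1100   60 | \      0101 1100   92 | |      0111 1100  124
=      0011 1101   61 | ]      0101 1101   93 | }      0111 1101  125
>      0011 1110   62 | ^      0101 1110   94 | ~      0111 1110  126
?      0011 1111   63 | _      0101 1111   95 | DEL    0111 1111  127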
When typing “GOTO 2”, a phrase in the programming language BASIC, the
computer would translate the characters into the corresponding binary sequence:

Character                          | G         | O         | T         | O         | Blank space | 2
Translation into computer language | 0100 0111 | 0100 1111 | 0101 0100 | 0100 1111 | 0010 0000   | 0011 0010


The computer would thus execute the sequence: 


010001110100111101010100010011110010000000110010 
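
The translation from typed characters to a string of bits can be reproduced with Python's built-in `ord` and `format`. This is a minimal sketch; the function names are chosen here for illustration.

```python
def to_ascii_bits(text):
    """Concatenate the 8-bit ASCII code of every character in text."""
    return "".join(format(ord(ch), "08b") for ch in text)


def from_ascii_bits(bits):
    """Split a bit string into bytes and map each one back to its character."""
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


print(to_ascii_bits("GOTO 2"))
```

The format code `"08b"` pads each value with leading zeroes to a full byte, which is what keeps the boundaries between characters recoverable in `from_ascii_bits`.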


The Hexadecimal system 


The hexadecimal system is another notable code used in computing. It is a number 
system that works with sixteen unique digits (hence hexadecimal), as opposed to 
the normal system that uses ten (decimal). One could say that the hexadecimal 
system is the computer’s second language after binary. Why a 16-digit system? 
Remember that the computer's basic unit of operation, the byte, is composed of
eight bits, which produces up to $2^8$ = 256 different combinations of 0 and 1, and
$2^8 = 2^4 \times 2^4$ = 16 x 16. In other words, the combination of two hexadecimal digits
equals 1 byte.

The sixteen digits of a hexadecimal system are the traditional 0, 1, 2, 3, 4, 5, 6,
7, 8, 9, and six more established by convention: A, B, C, D, E, F. To count in a
hexadecimal system, we do as follows:

From 0 to 15: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
From 16 to 31: 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 1A, 1B, 1C, 1D, 1E, 1F
From 32 on: 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 2A, 2B, 2C...
These files were generated automatically by a computer. Their strange names
are actually hexadecimal numbers.


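Hexadecimal digits do not distinguish between upper and lower case letters (1E
means the same as 1e). The following table shows the first 16 binary numbers and
their hexadecimal equivalents:

Hexadecimal | Binary
0           | 0000
1           | 0001
2           | 0010
3           | 0011
4           | 0100
5           | 0101
6           | 0110
7           | 0111
8           | 1000
9           | 1001
A           | 1010
B           | 1011
C           | 1100
D           | 1101
E           | 1110
F           | 1111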
To go from binary to hexadecimal, we group the bits in groups of four
from the right, and we complete the conversion according to the previous table. If
the number of binary digits is not a multiple of four, we fill in the difference with
0s on the left. To go from hexadecimal to binary, we convert each hexadecimal
digit into its binary equivalent, as in the following example:

$\mathrm{9F2}_{16}$ is the formal notation of a hexadecimal number (denoted by the subscript
16). Remember, the corresponding binary is:

9 = 1001, F = 1111, 2 = 0010

so $\mathrm{9F2}_{16} = 100111110010_2$ (Note: the subscript 2 indicates that the number is
expressed in a binary system).

Let's now carry out the reverse process: $1110100110_2$ has ten digits. Therefore,
we complete the number with two zeroes on the left to have 12 digits that we can
group by fours. We convert:

$1110100110_2 = 0011\ 1010\ 0110_2 = \mathrm{3A6}_{16}$.
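
Both directions of the conversion follow the table mechanically, which makes them simple to sketch in code. The function names below are illustrative assumptions, not part of any standard library.

```python
HEX_DIGITS = "0123456789ABCDEF"


def bin_to_hex(bits):
    """Pad with zeroes on the left to a multiple of four bits,
    then translate each group of four into one hexadecimal digit."""
    width = -(-len(bits) // 4) * 4  # round the length up to a multiple of 4
    bits = bits.zfill(width)
    return "".join(
        HEX_DIGITS[int(bits[i:i + 4], 2)] for i in range(0, len(bits), 4)
    )


def hex_to_bin(hex_digits):
    """Translate each hexadecimal digit into its four-bit binary equivalent."""
    return "".join(
        format(HEX_DIGITS.index(h.upper()), "04b") for h in hex_digits
    )


print(hex_to_bin("9F2"), bin_to_hex("1110100110"))
```

Calling `upper()` on each digit reflects the remark that hexadecimal digits do not distinguish between upper and lower case.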


What is the relationship between hexadecimal characters and ASCII codes? 
Every ASCII code contains eight bits (one byte) of information, therefore five ASCII 
characters contain 40 bits (five bytes) and, since a hexadecimal character contains 
four bits, we conclude that five ASCII characters are 10 hexadecimal characters. 

Let’s see an example of coding a phrase in hexadecimal code. Let’s try it with 
the name “NotRealCo Ltd”, following these steps. 


1. We translate “NotRealCo Ltd” into its binary version with standard ASCII.
2. We group the digits by fours. (If the length of the binary string is not a multiple
of four, we add 0 to the left.)
3. We consult the binary and hexadecimal conversion table and continue with the
translation.
Character | Binary equivalence according to ASCII | Hexadecimal translation
N         | 0100 1110                             | 4E
o         | 0110 1111                             | 6F
t         | 0111 0100                             | 74
R         | 0101 0010                             | 52
e         | 0110 0101                             | 65
a         | 0110 0001                             | 61
l         | 0110 1100                             | 6C
C         | 0100 0011                             | 43
o         | 0110 1111                             | 6F
space     | 0010 0000                             | 20
L         | 0100 1100                             | 4C
t         | 0111 0100                             | 74
d         | 0110 0100                             | 64

Therefore, the phrase “NotRealCo Ltd” ciphered in hexadecimal, is as follows:

4E 6F 74 52 65 61 6C 43 6F 20 4C 74 64
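
The three steps can be condensed into a single expression, since Python can format a character's ASCII value directly as two hexadecimal digits. The function name `text_to_hex` is an assumption made for this sketch.

```python
def text_to_hex(text):
    """Write each character's ASCII code as two hexadecimal digits."""
    return " ".join(format(ord(ch), "02X") for ch in text)


def hex_to_text(hex_string):
    """Reverse the translation: read the space-separated pairs back as characters."""
    return "".join(chr(int(pair, 16)) for pair in hex_string.split())


print(text_to_hex("NotRealCo Ltd"))
```

Each character always yields exactly two hexadecimal digits, which is the relationship between bytes and hexadecimal characters stated earlier. Upper-case and lower-case letters have different codes, so the capitalisation of the phrase affects the result.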


Numeral systems and base changes 


A numeral system of n digits is also said to be of base n. Human hands have ten fin- 
gers, and that is probably why the decimal numeral system was invented — counting 
was carried out with fingers. A decimal number such as 7392 represents a quantity
equal to 7 thousands, 3 hundreds, 9 tens and 2 units. Thousands, hundreds, tens and units
are powers of the base of the number system; in this case, 10. The number 7392, therefore,
could be expressed as:

$$7392 = 7\cdot 10^3 + 3\cdot 10^2 + 9\cdot 10^1 + 2\cdot 10^0.$$


However there is an implicit agreement that we only write the coefficients (7, 
3,9 and 2). Besides the decimal system, there are many other numeral systems (in 
fact, their total number is infinite). In this volume we have paid special attention 
to two systems: the binary system of base 2, and the hexadecimal, of base 16. In a 
binary numeral system, the coefficients only have two possible values: 0 and 1. The
digits of the binary numbers are coefficients of the powers of 2. So, the number
$11011_2$ could also be written as

$$11011_2 = 1\cdot 2^4 + 1\cdot 2^3 + 0\cdot 2^2 + 1\cdot 2^1 + 1\cdot 2^0.$$
If we calculate the expression to the right of the equals sign, we get 27, which is
the decimal form of the binary number 11011. For the inverse process, we successively
divide the decimal number by 2 (the binary base), and we make a note of the
remainders until we obtain a quotient of 1. The binary number will have the final
quotient as its first digit, and this will be followed by the remainders starting with
the last in the list. To visualise the process, we will write the number 76 in binary:

76 divided by 2 has a quotient of 38 and a remainder of 0.
38 divided by 2 has a quotient of 19 and a remainder of 0.
19 divided by 2 has a quotient of 9 and a remainder of 1.
9 divided by 2 has a quotient of 4 and a remainder of 1.
4 divided by 2 has a quotient of 2 and a remainder of 0.
2 divided by 2 has a quotient of 1 and a remainder of 0.

Therefore, the number 76 written in a binary system would be $1001100_2$. This
result can be verified in the previous ASCII table (keep in mind that in the
corresponding code we include an additional 0 at the beginning to create strings of
four digits). Converting a quantity expressed in one numeral system to another is
called a base change.
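
The repeated-division procedure works for any base, not only base 2. The sketch below is one possible way of writing it; it keeps dividing until the quotient reaches 0 rather than stopping at 1, which produces the same digits because the last remainder then plays the role of the final quotient. The names `to_base` and `from_base` are illustrative.

```python
DIGITS = "0123456789ABCDEF"


def to_base(number, base=2):
    """Convert a non-negative integer to the given base by repeated division,
    collecting the remainders and reading them in reverse order."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def from_base(digits, base=2):
    """Evaluate the digits as coefficients of descending powers of the base."""
    value = 0
    for d in digits:
        value = value * base + DIGITS.index(d.upper())
    return value


print(to_base(76), from_base("11011"))
```

With `base=16` the same two functions perform the change between decimal and hexadecimal.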


Codes for detecting transmission errors 


The codes outlined above make possible secure and effective communications
between computers, between programs and between users. But this on-line
language is based on a general theory of information that underlies the process of 
communication itself. The first step in formulating this theory is so basic that it is 
sometimes easy to overlook: how to measure information. 

A phrase as simple as “2 kB attachment” is based on a long series of brilliant
intuitions that start with an article published in two parts in 1948 by the American
engineer, Claude E. Shannon, and titled A Mathematical Theory of Communication. In
this seminal article, Shannon proposed a unit of measurement for the quantity of
information that he called a bit. The general problem that led to Shannon’s work
was one that will be familiar to modern readers. What is the best way to encode
a message to prevent it being corrupted during transmission? Shannon concluded
that it was impossible to define a code that would always prevent the loss of
information. Put another way, errors will inevitably occur when information is
transmitted. However, this conclusion did not halt efforts to define standards of
codification that, even if they could not prevent corruption, could at least ensure
the highest levels of reliability.

In digital transmission of information, once a message has been generated by the
sender (that can easily be a non-human agent, such as a computer or some other
device), it is encoded in a binary system and enters a channel of communication
that consists of the sender’s computer and that of the receiver plus the connection
itself, which is either a physical cable or wireless (radio waves, infrared etc). The
journey through the channel is the most sensitive process because the message can
be subjected to all kinds of interference, including mixing with other signals, the
adverse effects of temperature in the physical medium, and attenuation (weakening)
of the signal as it passes through the medium. These sources of interference
are termed noise.

To minimise the impact of noise, not only do you have to protect the connec- 
tion, you also have to establish a way of detecting errors and correcting them when 
they arise. 

One of these methods is called redundancy. Redundancy consists of the repetition, under determined criteria, of certain characteristics of the message. Here is an example that will help to clarify the process. Let us imagine a text in which each word is made up of four bits, for a total of 16 words ($2^4 = 16$), each one of the type $a_1 a_2 a_3 a_4$. Before sending a message we add three additional bits $c_1 c_2 c_3$ to the word, so that the encoded message as it travels through the communication channel will have the form $a_1 a_2 a_3 a_4 c_1 c_2 c_3$. The elements $c_1 c_2 c_3$ protect the integrity of the message (they are called parity codes) and they are generated as follows:


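$$c_1 = \begin{cases} 0 & \text{if } a_1 + a_2 + a_3 \text{ is even} \\ 1 & \text{if } a_1 + a_2 + a_3 \text{ is odd} \end{cases}$$

$$c_2 = \begin{cases} 0 & \text{if } a_1 + a_3 + a_4 \text{ is even} \\ 1 & \text{if } a_1 + a_3 + a_4 \text{ is odd} \end{cases}$$

$$c_3 = \begin{cases} 0 & \text{if } a_2 + a_3 + a_4 \text{ is even} \\ 1 & \text{if } a_2 + a_3 + a_4 \text{ is odd} \end{cases}$$

We would assign the following parity codes to the message 0111:

Since $0 + 1 + 1 = 2$ is even, the number $c_1 = 0$.

Since $0 + 1 + 1 = 2$ is even, the number $c_2 = 0$.

Since $1 + 1 + 1 = 3$ is odd, the number $c_3 = 1$.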


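Consequently, the message 0111 would be transmitted as 0111001. Applying the same rule to each of the 16 “words”, we get the following table:

Original word | Transmitted message
0000 | 0000000
0001 | 0001011
0010 | 0010111
0011 | 0011100
0100 | 0100101
0101 | 0101110
0110 | 0110010
0111 | 0111001
1000 | 1000110
1001 | 1001101
1010 | 1010001
1011 | 1011010
1100 | 1100011
1101 | 1101000
1110 | 1110100
1111 | 1111111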
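The construction of the parity codes can be sketched in Python as follows. The sketch assumes the rule $c_1 = a_1 + a_2 + a_3$, $c_2 = a_1 + a_3 + a_4$, $c_3 = a_2 + a_3 + a_4$ (each taken modulo 2), which is the rule consistent with the worked example 0111 → 0111001 and with the transmitted messages quoted in this section. The function name `encode` is hypothetical.

```python
from itertools import product


def encode(word: str) -> str:
    """Append three parity bits to a 4-bit word given as a string of 0s and 1s."""
    a1, a2, a3, a4 = (int(bit) for bit in word)
    c1 = (a1 + a2 + a3) % 2  # 0 if the sum is even, 1 if it is odd
    c2 = (a1 + a3 + a4) % 2
    c3 = (a2 + a3 + a4) % 2
    return f"{word}{c1}{c2}{c3}"


# All 16 four-bit words, from 0000 to 1111, with their encoded form
codewords = [encode("".join(bits)) for bits in product("01", repeat=4)]

print(encode("0111"))  # 0111001
print(codewords)
```

Taking each sum modulo 2 is simply a compact way of writing "0 if even, 1 if odd".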


GENIUS WITHOUT A PRIZE 


Claude Elwood Shannon (1916-2001) was one of the greatest scientific figures of the 20th century. Educated in electrical engineering at the University of Michigan and the Massachusetts Institute of Technology, he worked as a mathematician at Bell Labs, where he did research on cryptography and communication theory. His contributions to information theory are sufficient to place him at the top table of innovators, but since his work was halfway between mathematics and information technology, he never received the prize coveted by all scientists: the Nobel.
Let us suppose that at the end of the journey, the receiving system gets the message 1010110. Note that this combination of 0s and 1s is not among the possible messages and must, therefore, contain a transmission error. To try to correct the error, the system compares the received message, digit by digit, with each of the possible messages to find the most probable alternative. To do so, it counts in how many positions the digits differ, as we show below:


Possible message | Received message | Number of positions with different digits
0000000 | 1010110 | 4
0001011 | 1010110 | 5
0010111 | 1010110 | 2
0100101 | 1010110 | 5
1000110 | 1010110 | 1
1100011 | 1010110 | 4
1010001 | 1010110 | 3
1001101 | 1010110 | 4
0110010 | 1010110 | 3
0101110 | 1010110 | 4
0011100 | 1010110 | 3
1110100 | 1010110 | 2
1101000 | 1010110 | 5
1011010 | 1010110 | 2
0111001 | 1010110 | 6
1111111 | 1010110 | 3


The erroneous word (1010110) differs from another word (1000110) by a single digit. Since this is the smallest difference, the system will offer the recipient this second, corrected version. The principle is analogous to that of the spell checker on a word processor. When it detects a term that does not appear in its internal dictionary, it proposes a series of close alternatives. The number of positions by which a message, understood as a sequence of characters, differs from another is known as the distance between the two sequences. This specific mechanism of error detection and error correction was proposed by the American Richard W. Hamming (1915-1998), a contemporary of Claude Shannon.
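The comparison just described can be sketched as a nearest-word search. The list `CODEWORDS` below holds the 16 possible messages of the example, and the helper names `distance` and `correct` are hypothetical; real receivers use faster decoding methods, so this is only an illustration of the principle.

```python
CODEWORDS = [
    "0000000", "0001011", "0010111", "0100101", "1000110", "1100011",
    "1010001", "1001101", "0110010", "0101110", "0011100", "1110100",
    "1101000", "1011010", "0111001", "1111111",
]


def distance(x: str, y: str) -> int:
    """Number of positions in which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(x, y))


def correct(received: str) -> str:
    """Return the possible message closest to the received one."""
    return min(CODEWORDS, key=lambda word: distance(word, received))


print(distance("1010110", "1000110"))  # 1
print(correct("1010110"))              # 1000110
```

A message that is already one of the 16 possible words is at distance 0 from itself, so `correct` returns it unchanged.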

In information, as in any other field, it is one thing to detect possible errors, and quite another to correct them. In codes such as this last example, if there is only one candidate of minimal distance, the problem is simple enough. If we call $t$ the minimum number of times that 1 appears in a sequence (omitting the sequence that is all 0), we can verify that:

If $t$ is odd, we can correct $\frac{t-1}{2}$ errors.

If $t$ is even, we can correct $\frac{t-2}{2}$ errors.

If our only purpose is to detect errors, the maximum number we can detect will be $t - 1$. In the 16-word language expounded before, $t = 3$, from which we get that the mechanism is capable of detecting $3 - 1 = 2$ errors and of correcting $(3 - 1)/2 = 1$ error.
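As a quick check of these figures, the value of $t$ and the two capacities can be computed directly from the list of possible messages. The function name `capacities` is hypothetical, and the sketch simply applies the rule stated above to the 16 words of the example.

```python
CODEWORDS = [
    "0000000", "0001011", "0010111", "0100101", "1000110", "1100011",
    "1010001", "1001101", "0110010", "0101110", "0011100", "1110100",
    "1101000", "1011010", "0111001", "1111111",
]


def capacities(codewords):
    """Return (t, detectable errors, correctable errors) for a list of words."""
    # t is the smallest number of 1s in any word other than the all-zero word
    t = min(word.count("1") for word in codewords if "1" in word)
    if t % 2 == 1:
        correctable = (t - 1) // 2
    else:
        correctable = (t - 2) // 2
    return t, t - 1, correctable


print(capacities(CODEWORDS))  # (3, 2, 1)
```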


THIRD GENERATION CRYPTOGRAPHY 


In 1997, a protocol was introduced for the secure transmission of information through wireless networks by the name of WEP (the acronym for Wired Equivalent Privacy). This protocol includes an encrypting algorithm called RC4, with two types of keys of 5 and 13 ASCII characters respectively. We are dealing, therefore, with keys of 40 or 104 bits or, alternatively, of 10 or 26 hexadecimal characters:

5 alphanumeric characters = 40 bits = 10 hexadecimal characters
13 alphanumeric characters = 104 bits = 26 hexadecimal characters

The connection provider supplies the keys, although the user can generally change them. Before establishing the connection, the computer asks for the key. In the following dialogue box we see an error message asking for the WEP key, specifying its length in bits, ASCII characters and hexadecimal characters:

Wireless configuration

The network password has to be 40 bits or 104 bits depending on your network configuration. It can be written as 5 or 13 ASCII characters, or 10 or 26 hexadecimal characters.

In truth, the real keys are longer. Starting with those supplied by the user, the algorithm RC4 generates a new key with more bits, which is the one used to cipher the transmission. This is public-key cryptography and it will be explained in more detail in Chapter 5. A user who wishes to change the key will do well to remember that a key of ten hexadecimal characters will be more secure than a key of five alphanumeric characters, although the bit size is the same. Of course it is also certain that “james” is easier to remember than its hexadecimal equivalent, “6A616D6573”.
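The relationship between the two ways of writing a key can be checked with a short sketch. The function name `ascii_key_to_hex` is hypothetical; the only assumption is that each ASCII character occupies 8 bits, that is, two hexadecimal characters.

```python
def ascii_key_to_hex(key: str) -> str:
    """Write an ASCII key as uppercase hexadecimal, two characters per letter."""
    return key.encode("ascii").hex().upper()


key = "james"
hex_key = ascii_key_to_hex(key)

print(hex_key)       # 6A616D6573
print(len(key) * 8)  # 40 bits
print(len(hex_key))  # 10 hexadecimal characters
```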
Other codes: the standards of industry and commerce


Although less glamorous than cryptography or binary mathematics, and often invisible to us despite their ubiquity, the standardised codes of banks, supermarkets, and other large economic players are one of the pillars that support modern society. In the case of these codes, the priority is to ensure the unique and accurate identification of products, be they bank accounts, books or apples. We will now examine them in more detail.


Credit cards 


The debit and credit cards offered by major banks and department stores are essentially identified by set groups of numbers, all generated and checked with the same algorithm and verification system, which is based on our old friend, modular arithmetic. The majority of cards have 16 digits, each between 0 and 9. The digits are grouped in fours so they can be read more easily. For our purposes we will denote them as:


ABCD EFGH IJKL MNOP 


Each group of digits codifies some piece of information: the first group (ABCD) 
corresponds to the ID of the bank (or whichever entity is providing the service). 
Each bank has a different number that may vary according to the continent, and 
that is also related to the card’s brand and conditions. For example, in the case of 


VISA and some prominent banks, the first four numbers are as follows: 


4940 Citibank 


4024 Bank of America 
4128 Citibank (USA) 
4302 HSBC 


The fifth digit (E) corresponds to the type of card and indicates which financial institution is administering the account:


As we can see, it is not a rigid rule. 


The following ten digits (FGH IJKL MNO) are a unique identifier for each card. This identifier not only supplies a reference number for each client account, but is also linked to the branding of the card (Classic, Gold, Platinum etc), the associated credit limit, the interest rates for each type of balance and the expiration date.

Finally, there is a control digit (P) that relates to the previous digits according to Luhn’s algorithm, so called in honour of Hans Peter Luhn, the German engineer who developed it. For a 16-digit card, this algorithm works as follows:


1) For each digit in an odd position, starting with the first number on the left, 
we calculate a new digit by multiplying it by two. If the result of this multiplica- 
tion is greater than 9, we add the two digits of the new number (or we perform 
the equivalent operation of subtracting 9). For example, if we get 18, we add 1 
+ 8 = 9, or else we subtract 18 — 9 = 9. 

2) Next, we add all the numbers calculated in this way, and the digits located in 
even positions (including the final control digit). 

3) If the total is a multiple of 10 (that is, its value is 0 in mod 10), the numbers 
on the card are valid. Note that it is the final control digit that makes the even- 
tual total a multiple of 10. 
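The three steps above can be sketched in Python as follows. The function name `luhn_valid` is hypothetical, and the sketch assumes a 16-digit number with positions counted from the left, as in the description; spaces between the groups are ignored.

```python
def luhn_valid(number: str) -> bool:
    """Check a 16-digit card number with the three steps described above."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for position, digit in enumerate(digits, start=1):
        if position % 2 == 1:   # odd position, counting from the left
            digit *= 2
            if digit > 9:       # same as adding the two digits of the result
                digit -= 9
        total += digit          # even positions are added unchanged
    return total % 10 == 0


print(luhn_valid("1234 5678 9012 3452"))  # True
print(luhn_valid("1234 5678 9012 3453"))  # False
```

Changing any single digit of a valid number makes the total stop being a multiple of 10, which is why the check catches simple typing mistakes.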


One of the first credit cards to gain wide acceptance was Diners Club. The driving force behind it was the American Frank McNamara. In 1950, he managed to persuade various restaurants to accept payment by credit when offered with a personalised, guaranteed card that McNamara distributed to his best clients. The most common use of credit cards in their first decades was for American travelling salesmen to pay for meals while on the road.
For example, in the case of a card numbered as follows:

1234 5678 9012 3452

According to Luhn’s algorithm:

$1 \cdot 2 = 2$

$3 \cdot 2 = 6$

$5 \cdot 2 = 10 \rightarrow 1 + 0 = 1$

$7 \cdot 2 = 14 \rightarrow 1 + 4 = 5$ (or $14 - 9 = 5$)

$9 \cdot 2 = 18 \rightarrow 1 + 8 = 9$

$1 \cdot 2 = 2$

$3 \cdot 2 = 6$

$5 \cdot 2 = 10 \rightarrow 1 + 0 = 1$

$2 + 6 + 1 + 5 + 9 + 2 + 6 + 1 = 32$

$2 + 4 + 6 + 8 + 0 + 2 + 4 + 2 = 28$

$32 + 28 = 60$

The result is 60, a multiple of 10. Therefore the card’s code number is valid.


Another way to apply Luhn’s algorithm is as follows: the card number ABCD EFGH IJKL MNOP is correct if twice the sum of the digits in odd positions, plus the sum of the digits in even positions, plus the number of digits in odd positions that are greater than 4, is a multiple of 10. That mouthful is perhaps better expressed as:

$2(A+C+E+G+I+K+M+O) + (B+D+F+H+J+L+N+P) + (\text{number of digits in odd positions greater than 4}) \equiv 0 \pmod{10}$.

Applying this second version of the algorithm to the earlier example:

1234 5678 9012 3452

$2 \cdot (1+3+5+7+9+1+3+5) + (2+4+6+8+0+2+4+2) + 4 = 100 \equiv 0 \pmod{10}$.


Again we have verified that the number is a valid credit card number and have shown that apparently random card codes follow a strict mathematical standard.
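The two versions agree because, for a doubled digit greater than 9, subtracting 9 gives the same remainder modulo 10 as adding 1. The sketch below computes both totals for comparison; the function names `luhn_total_classic` and `luhn_total_alternative` are hypothetical.

```python
def card_digits(number: str):
    """Digits of the card number, ignoring the spaces between groups."""
    return [int(ch) for ch in number if ch.isdigit()]


def luhn_total_classic(number: str) -> int:
    """Double the odd-position digits, reduce them to one digit, add everything."""
    digits = card_digits(number)
    odd, even = digits[0::2], digits[1::2]
    return sum(2 * d - 9 if d > 4 else 2 * d for d in odd) + sum(even)


def luhn_total_alternative(number: str) -> int:
    """Twice the odd sum, plus the even sum, plus the count of odd digits above 4."""
    digits = card_digits(number)
    odd, even = digits[0::2], digits[1::2]
    return 2 * sum(odd) + sum(even) + sum(1 for d in odd if d > 4)


card = "1234 5678 9012 3452"
print(luhn_total_classic(card))      # 60
print(luhn_total_alternative(card))  # 100
```

The totals differ (60 and 100), but they always leave the same remainder when divided by 10, which is all the test looks at.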
EXCEL APPLICATION FOR THE CALCULATION OF THE CONTROL DIGIT OF A CREDIT CARD

The number associated with a credit card consists of 15 digits plus a control code. The numbers are grouped in four sets of four digits. The control digit (C.D.) is calculated according to the algorithm below:

Digits used
Digits in even position
Sum of digits in even position
Number of digits in even position greater than 4
Sum of the two previous quantities
Digits in odd position
Sum of digits in odd position
Sum of the two preceding results plus 1
Remainder of dividing the previous result by 10
The C.D. is 0 if the previous result is 0, otherwise, it is 10 less the previous result


Would it be possible to recover a digit missing from a card code? Yes, as long as we are dealing with a valid credit card. Let us solve for the value of X in the number 4539 4512 03X8 7356.

We start by multiplying by 2 the numbers in the odd positions (4-3-4-1-0-X-7-5), reducing them to a single digit.

$4 \cdot 2 = 8$

$3 \cdot 2 = 6$

$4 \cdot 2 = 8$

$1 \cdot 2 = 2$

$0 \cdot 2 = 0$

$X \cdot 2 = 2X$

$7 \cdot 2 = 14$, $14 - 9 = 5$

$5 \cdot 2 = 10$, $10 - 9 = 1$

We add the digits of the even positions and the new digits from the odd positions and we get:

$30 + 41 + 2X = 71 + 2X$,

which we know has to be a multiple of 10.

If the value of X were greater than 4 (and less than 10), 2X would be a number between 10 and 18. The value of 2X reduced to a single digit is $2X - 9$, so the previous sum would be $71 + 2X - 9 = 62 + 2X$. The only value of X in this range that would make the expression a multiple of 10 is 9. If, on the contrary, X were less than or equal to 4, there is no value that makes $71 + 2X$ a multiple of 10, since that sum is always odd.

Consequently, the lost digit is 9, and the complete number of the credit card is 4539 4512 0398 7356.
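The same result can be reached by brute force, trying each of the ten possible digits in place of X and keeping those that pass the check. The function names `luhn_valid` and `recover_digit` are hypothetical, and the sketch assumes a 16-digit number with exactly one missing digit marked as X.

```python
def luhn_valid(number: str) -> bool:
    """Luhn check for a 16-digit number, positions counted from the left."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for position, digit in enumerate(digits, start=1):
        if position % 2 == 1:
            digit = 2 * digit - 9 if digit > 4 else 2 * digit
        total += digit
    return total % 10 == 0


def recover_digit(number_with_x: str):
    """Return every digit that makes the number valid when it replaces X."""
    return [d for d in "0123456789" if luhn_valid(number_with_x.replace("X", d))]


print(recover_digit("4539 4512 03X8 7356"))  # ['9']
```

Only one candidate survives because each of the ten digits contributes a different remainder modulo 10 to the total.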


Barcodes 


The first barcode system was patented on October 7, 1952, by Americans Norman Woodland and Bernard Silver. The early codes were quite different from today’s. In place of the familiar bars, Woodland and Silver thought in terms of concentric circles. The first official use of a barcode in a shop was in 1974 in Troy, Ohio.

The modern barcode consists of a series of black bars (which are coded as 1 in the binary system) and the blank spaces between them (which are coded as 0). Barcodes are used to identify physical objects. The codes are generally printed on labels and are read by an optical device. This device, similar to a scanner, measures the reflected light and converts the dark and light bands into an alphanumeric key, which it then sends to a computer. There are numerous standards for barcodes: Code 128, Code 39, Codabar, EAN (this appeared in 1976 in versions of 8 and 13 digits) and UPC (Universal Product Code, used primarily in the US and available in versions of 12 and 8 digits). The most common code is the 13-digit version of EAN. Despite the variety in standards, the barcode allows any product to be identified in any part of the world, swiftly and with little margin of error.

How the thickness of bars and spaces in a barcode corresponds to binary digits.


U.S. Patent 2,612,994, N. J. Woodland et al., “Classifying Apparatus and Method”, Oct. 7, 1952.

The patent of Woodland and Silver's system of concentric circles that pre-dates modern barcodes.
EXCEL APPLICATION FOR THE CALCULATION OF THE CONTROL DIGIT OF THE EAN-13 CODE

A barcode in the EAN-13 format is a number made up of 12 digits plus a 13th called a control digit (C.D.). The 13 digits are distributed in four groups: country, company, product and C.D.

Sum of digits in odd position
Sum of digits in even position and the result multiplied by 3: =(D4+G4+[illegible]+L4+N4+P4)*3
Sum of the two previous results: =R6+R7
Remainder of dividing the previous result by 10: =RESIDUO(R8; 10)
The C.D. is 0 or 10 less the previous result: =SI(R9=0; 0; 10-R9)

(RESIDUO and SI are the Spanish-language names of the Excel functions MOD and IF.)
The EAN-13 barcode


The EAN code, originally named as the acronym for “European Article Number” when it was created in 1976, is now known as the International Article Number. It is the most established barcode standard and is used throughout the world. EAN codes generally consist of 13 digits represented by black bars and white spaces that together form a binary code that is easy to read. EAN-13 represents these 13 digits by means of 30 bars and spaces. The digits are distributed in three parts: the first, which consists of 2 or 3 digits, indicates the country code; the second, made up of 9 or 10 digits, identifies the company and the product; the third, of only one digit, acts as the control code. For a code ABCDEFGHIJKLM these parts are divided as follows:


• The first two (AB) form the code of the country of origin of the product. The UK’s code is 50, for example, while Ireland’s is 539.

• The next five (CDEFG) identify the company producing the product.

• The other five (HIJKL) indicate the product code that has been assigned by the company.

• The last (M) is the control number. To calculate it, we add the digits in the odd positions, starting at the left and without counting the control number. To the resulting value we then add three times the sum of the digits in even positions. The control number is the value that makes the total sum just calculated a multiple of 10. As we can see, the barcode control system is strongly reminiscent of the one employed by credit cards.


Let’s verify if this barcode is valid:

8413871003049

$8+1+8+1+0+0+3(4+3+7+0+3+4) = 18 + 3(21) = 18 + 63 = 81$.

The correct control digit should be $90 - 81 = 9$, which matches the last digit of the code.


The mathematical model of the algorithm is based on modular arithmetic (modulus 10) as follows:

For a code ABCDEFGHIJKLM, we will call N the value of the expression

$A+C+E+G+I+K+3(B+D+F+H+J+L) = N$

and n the value of N in modulus 10. The control digit M is defined as $M = 10 - n$ (or 0 when $n = 0$). In our example, we have that $81 \equiv 1 \pmod{10}$, therefore the control digit will, indeed, be $10 - 1 = 9$.

The previous algorithm can be formulated in an equivalent way using the control digit in the calculations. The following technique allows us to verify the validity of the control code without having to calculate it first:

$A+C+E+G+I+K+3(B+D+F+H+J+L)+M \equiv 0 \pmod{10}$.

For the sample code

5701263900544

$5+0+2+3+0+5+3(7+1+6+9+0+4)+4 = 100$.

$100 \equiv 0 \pmod{10}$.

The code is therefore valid.
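Both formulations can be sketched in Python. The function names `ean13_check_digit` and `ean13_valid` are hypothetical, and the codes are handled as strings of digits so that leading zeros are preserved.

```python
def ean13_check_digit(first12: str) -> int:
    """Control digit for the first 12 digits of an EAN-13 code."""
    digits = [int(ch) for ch in first12]
    # Odd positions (1st, 3rd, ...) have weight 1, even positions weight 3
    n = sum(digits[0::2]) + 3 * sum(digits[1::2])
    return (10 - n % 10) % 10  # the outer % 10 turns a result of 10 into 0


def ean13_valid(code: str) -> bool:
    """Check a full 13-digit code against its own control digit."""
    return len(code) == 13 and ean13_check_digit(code[:12]) == int(code[12])


print(ean13_check_digit("841387100304"))  # 9
print(ean13_valid("8413871003049"))       # True
print(ean13_valid("5701263900544"))       # True
```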


Out of curiosity, we will try to determine the value of a lost number of a barcode. Specifically, that represented by X in the following code:

401332003X497

We arrange the numbers according to the algorithm:

$4+1+3+0+3+4+3(0+3+2+0+X+9)+7 = 64 + 3X \equiv 0 \pmod{10}$.

In modulus 10, we get the following equation:

$4 + 3X \equiv 0 \pmod{10}$.

$3X \equiv -4 + 0 = -4 + 10 \cdot 1 = 6 \pmod{10}$.

Note that 3 has an inverse since $\gcd(3, 10) = 1$. That inverse is 7, because $3 \cdot 7 = 21 \equiv 1 \pmod{10}$, so $X \equiv 7 \cdot 6 = 42 \equiv 2 \pmod{10}$.

We therefore find that X has to be 2. Therefore the valid code is

4013320032497.
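The same equation can be solved in code with a modular inverse. The sketch below is only an illustration: the function name `solve_missing_digit` is hypothetical, and it relies on Python's built-in three-argument `pow` with exponent -1 (available from Python 3.8) to obtain the inverse of the weight modulo 10.

```python
def solve_missing_digit(code: str) -> int:
    """Find the digit marked X in a 13-character EAN-13 code."""
    # Weights 1, 3, 1, 3, ... from the left; the control digit has weight 1
    weights = [1 if index % 2 == 0 else 3 for index in range(13)]
    known = sum(w * int(ch) for w, ch in zip(weights, code) if ch != "X")
    weight_of_x = weights[code.index("X")]
    # Solve known + weight_of_x * X = 0 (mod 10); 1 and 3 both have inverses
    inverse = pow(weight_of_x, -1, 10)
    return (-known * inverse) % 10


print(solve_missing_digit("401332003X497"))  # 2
```

Because both weights, 1 and 3, are coprime with 10, a single missing digit can always be recovered in this way.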


QR CODES 


In 1994, the Japanese company Denso-Wave developed a graphic system of encoding to identify the parts of cars in an assembly line. The system, called QR for the speed with which it could be read by machines designed for the purpose (the initials QR stand for quick response), ended up expanding way beyond car factories. In just a few years, the majority of Japan’s mobile telephones could instantly read the information contained in the code. The QR is a type of matrix code, formed by a variable number of black or white squares that, in turn, are arranged in the shape of a larger square. Each square represents a binary value, 0 or 1, and, therefore, the codes operate in a very similar way to barcodes, although adding a second dimension gives the code a larger storage capacity.

A QR Code of 37 rows for the University of Osaka, Japan.
Chapter 5


An Open Secret: the 
Cryptography of Public Keys 


Cryptography was not ignored during the rapid growth in computing technology. 
To use a computer to cipher a message is more or less the same process as ciphering 
without a computer, but there are three fundamental differences. First, a computer 
can be programmed to simulate the work of a conventional encoding machine of, 
for example, 1,000 rotors without the need to physically build such a device. Sec- 
ond, a computer works only with binary numbers and, therefore, all ciphering will 
occur at this level (even if the numerical information is subsequently deciphered 
into text again). And third, computers are extremely fast at computing ciphers and 
deciphering messages. 

The first ciphers designed to take advantage of the potential of computers were 
developed in the 1970s. An example is Lucifer, a cipher that divided the text into 
blocks of 64 bits and encrypted some of them by means of a complex substitution 
and then grouped them again into a newly ciphered block of bits and continued 
to repeat the process. The system required both sender and receiver to be equipped 
with a computer running the same encryption program and a shared numerical key. 
A 56-bit version of Lucifer called DES was introduced in 1976. DES, standing for 
Data Encryption Standard, is still in use today although it was cracked in 1999 and 
largely replaced by the 128-bit AES (Advanced Encryption Standard) in 2002. 

Without a doubt, this kind of encryption made the most of a computer's processing power, but just like their thousand-year-old predecessors, computerised codes were still vulnerable to the danger that an unauthorised receiver could obtain the keys and, knowing the encryption algorithm, decipher the message. This basic weakness of every “classical” system of cryptography is known as the key distribution problem.


The key distribution problem 


It is generally agreed that, in order to maintain the security of a code, the encryption keys should be protected more than the algorithm. That creates a problem: how to distribute keys securely. Even in the simplest cases, it could lead to difficult logistical problems, such as how to distribute thousands of code books among a large army, or how to distribute them to mobile communication centres that operate in extreme circumstances, like submarine crews or units in the heat of battle. No matter how sophisticated a classical encryption system was, all were vulnerable to the interception of their respective keys.


The Diffie-Hellman algorithm 


The notion of a secure exchange of keys might sound self-contradictory: how can you send a key as an encrypted message when encrypting it requires a key that has already been exchanged in the usual way? However, if the exchange is set up as a communication with multiple exchanges, one can imagine a solution to the problem, at least on a theoretical level.

Let us suppose that a sender named James encrypts a message with his key and sends it to the receiver, Peter. The latter re-encrypts the ciphered message with his key and returns it to the sender. James deciphers the message with his key and sends this new message, which is now ciphered only with Peter’s key, back to Peter, who goes on to decipher it. The age-old problem of the secure exchange of keys has all of a sudden been resolved! Can this really be true? Sadly, no. In any complex encryption algorithm, the order in which the keys are applied is critical, and we have seen that in our theoretical example, James has to decipher a message that has already been ciphered with another key. When the order of the ciphers is reversed, the result will be gibberish. The above scenario does not really explain the theory, but it shines a light on a way to solve the problem.

THE MEN BEHIND THE ALGORITHM

Bailey Whitfield Diffie was born in 1944 in the United States. With a mathematics degree from the Massachusetts Institute of Technology (MIT), he served as the Chief Security Officer and Vice President of California-based Sun Microsystems from 2002 until 2009. For his part, the engineer Martin Hellman was born in 1945 and carried out his professional career at IBM and MIT, where he collaborated with Diffie.

In 1976, two young American scientists, Bailey Whitfield Diffie and Martin Hellman, found a way in which two people could exchange ciphered messages without having to exchange any secret key whatsoever. This method makes use of modular arithmetic, as well as the properties of prime numbers. The idea is as follows:


1. James picks a number that he keeps secret. We will call this number $N_{J1}$.
2. Peter picks another random number that he, too, keeps secret. We will call this number $N_{P1}$.
3. Next, both James and Peter apply a function of the type $f(x) = a^x \pmod{p}$ to their respective numbers, with p being a prime number known by both.
   • From this operation James obtains a new number, $N_{J2}$, which he then sends to Peter.
   • Performing the same operation, Peter obtains a new number, $N_{P2}$, which he sends to James.
4. James evaluates an expression of the form $N_{P2}^{N_{J1}} \pmod{p}$ and gets a new number, $C_J$.
5. Peter evaluates an expression of the form $N_{J2}^{N_{P1}} \pmod{p}$ and gets a new number, $C_P$.

Although it appears impossible, $C_J$ and $C_P$ are the same. And now we have the key. Note that the only times in which James and Peter exchanged information were when they agreed on the function $f(x) = a^x \pmod{p}$ and when they sent each other $N_{J2}$ and $N_{P2}$. Neither is the key and their interception, therefore, will not threaten the security of the encryption system. The key of this system will have the general form:

$a^{N_{J1} \cdot N_{P1}}$ in modulus p.

It is also important to take into account that the original function has the special feature of not being reversible, that is, knowing both the function and the result of applying it to a variable x, it is impossible (or at least very difficult) to obtain the original variable x.
Next, and to emphasise the point, we will repeat the process with specific values. The function being used is:

$f(x) = 7^x \pmod{11}$.

1. James picks a number, $N_{J1}$, for example 3, and calculates $f(x) = 7^x \pmod{11}$, obtaining $f(3) = 7^3 \equiv 2 \pmod{11}$.
2. Peter picks a number, $N_{P1}$, for example 6, and calculates $f(x) = 7^x \pmod{11}$, obtaining $f(6) = 7^6 \equiv 4 \pmod{11}$.
3. James sends Peter his result, 2, and Peter does the same with his, 4.
4. James calculates $4^3 \equiv 9 \pmod{11}$.
5. Peter calculates $2^6 \equiv 9 \pmod{11}$.

This value, 9, will be the key of the system.
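The exchange with these specific values can be reproduced in a few lines. The sketch uses Python's built-in `pow(base, exponent, modulus)` for modular exponentiation; the function and variable names are hypothetical, and the tiny numbers are those of the example, far too small for real use.

```python
def public_value(secret: int, a: int, p: int) -> int:
    """Apply f(x) = a^x (mod p) to a secret number."""
    return pow(a, secret, p)


def shared_key(received: int, secret: int, p: int) -> int:
    """Raise the number received from the other party to one's own secret."""
    return pow(received, secret, p)


a, p = 7, 11                       # public: the function f(x) = 7^x (mod 11)
james_secret, peter_secret = 3, 6  # private: never transmitted

james_public = public_value(james_secret, a, p)  # 2, sent to Peter
peter_public = public_value(peter_secret, a, p)  # 4, sent to James

key_james = shared_key(peter_public, james_secret, p)
key_peter = shared_key(james_public, peter_secret, p)

print(james_public, peter_public)  # 2 4
print(key_james, key_peter)        # 9 9
```

Both parties arrive at the same value because $(a^{6})^{3}$ and $(a^{3})^{6}$ are the same power of $a$, namely $a^{18}$, reduced modulo p.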


James and Peter have exchanged both the function f(x) and the numbers 2 and 4. Is this information useful to an eavesdropper? Let us suppose that our unwanted recipient knows both the function and the numbers. His problem now is to solve for $N_{J1}$ and $N_{P1}$ in modulus 11, $N_{J1}$ and $N_{P1}$ being the numbers that both James and Peter keep secret, even from each other. If the spy manages to discover them, he only has to calculate $a^{N_{J1} \cdot N_{P1}}$ in modulus p to have the key. The solution to this problem, by the way, is termed a discrete logarithm in mathematics. For example, in the case of:

$f(x) = 3^x \pmod{17}$

we can see that $3^x \equiv 15 \pmod{17}$ and, trying different values of x, we find that x = 6 and verify the relation $3^6 \equiv 15 \pmod{17}$.

Algorithms of this type, and the problem of the discrete logarithm, did not receive much attention until the beginning of the 1990s, and it is only in recent years that they have been developed. In the example above, we say that 6 is the discrete logarithm of 15 with a base of 3 and modulus 17.
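"Trying different values of x" can be written as a simple search. The function name `discrete_log` is hypothetical; the approach is the naive one and only works for moduli as small as those in the examples, which is precisely why the problem protects the key when the numbers are large.

```python
def discrete_log(base: int, target: int, p: int):
    """Smallest x with base^x = target (mod p), found by trial, or None."""
    value = 1  # base^0
    for x in range(p - 1):
        if value == target % p:
            return x
        value = (value * base) % p  # move on to the next power of the base
    return None


print(discrete_log(3, 15, 17))  # 6
print(discrete_log(7, 2, 11))   # 3, James's secret number
print(discrete_log(7, 4, 11))   # 6, Peter's secret number
```

The search takes a number of steps proportional to p, so every extra digit in the modulus multiplies the work by about ten.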

The special characteristic of this type of equation is, as we have already mentioned, that it is difficult to reverse: it is asymmetrical. For values of p greater than 300 and of a greater than 100, the solution (and, therefore, the cracking of the key) becomes extremely difficult.
VIRUSES AND “BACK DOORS”

Even the most secure public key cipher depends on the private key being kept secret. Consequently, if a computer virus infects a computer and locates and transmits this private key, it wreaks havoc on the encryption system. In 1998 it was discovered that a Swiss company, a leader in the production and sale of cryptographic products, had included “back doors” that detected the private keys of the users and returned them to the company. Some of this information was handed over to the United States government, which could thereby monitor the communications between the infected computers.


This algorithm is the foundation of modern cryptography. Diffie and Hellman presented their idea at the National Computer Conference, in a seminar that can only be termed groundbreaking. Their paper can be examined in its entirety at www.cs.berkeley.edu/~christos/classics/diffiehellman.pdf, where it appears with the title New Directions in Cryptography.

Diffie-Hellman’s algorithm demonstrated the possibility of creating a cryptographic method that did not require the exchanging of keys yet, paradoxically, relied on public communication for part of the process: the initial pair of numbers that serve to determine the key.

Put another way, it made it possible to have a secure encryption system between senders and receivers who never had to meet or agree a key in secret. However, certain problems remained: if James wants to send Peter a message while Peter is asleep, for example, he will have to wait for his opposite number to wake up to carry out the process of generating the key.

In the process of trying to discover new, more effective algorithms, Diffie theorised a system in which the ciphering key would be different from the deciphering key, and in which one could never be derived from the other. In this theoretical system, each user would have two keys: the encrypting key and the decrypting key. Of the two, the user would only make the first one public so that whoever should want to send him a message could encrypt it. Having received the message, the user would go on to decipher it using the decrypting key that had, obviously, remained secret. Would it be possible to put such a system into practice?
The primes come to the rescue: the RSA algorithm


In August of 1977, the famous US science writer, Martin Gardner, entitled his 
column on recreational mathematics for the journal Scientific American, “A new 
kind of cipher that would take millions of years to break.” After explaining the 
principles of the public key system, he listed the ciphered message as well as the 
public key N used to create the cipher: 


N = 114.381.625.757.888.867.669.235.779.976.146.612.010.218.296.721.242.362.562.561.842.935.706.935.245.733.897.830.597.123.563.958.705.058.989.075.147.599.290.026.879.543.541.


Gardner challenged his readers to decipher the message from the information given, and even offered up a clue: the solution required that N be factorized into its prime components p and q. To top it off, Gardner promised a prize of $100 (a reasonable sum at the time) to whoever got the right answer first. Anyone wanting more information on the cipher, Gardner wrote, should send a request to the cipher’s creators, Ron Rivest, Adi Shamir and Len Adleman from MIT's Laboratory for Information.

The correct answer was not received until 17 years later, and to find it took
the collaboration of more than 600 people. The prime factors turned out to be
p = 32.769.132.993.266.709.549.961.988.190.834.461.413.177.642.967.992.
942.539.798.288.533 and q = 3.490.529.510.847.650.949.147.849.619.903.898.
133.417.764.638.493.387.843.990.820.577, and the deciphered message read, “The
magic words are squeamish ossifrage.”
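The factorisation is easy to check once the answer is known, which is the asymmetry on which the whole challenge rests. The short sketch below multiplies the two factors quoted above; the helper name `from_dotted` is only an illustrative choice for stripping the thousands separators.

```python
def from_dotted(text):
    """Turn a number printed with '.' separators into an integer."""
    return int("".join(ch for ch in text if ch.isdigit()))

# Gardner's challenge number N (129 digits).
N = from_dotted("""114.381.625.757.888.867.669.235.779.976.146.
612.010.218.296.721.242.362.562.561.842.935.706.935.245.733.
897.830.597.123.563.958.705.058.989.075.147.599.290.026.879.
543.541""")

# The two prime factors found 17 years later.
P = from_dotted("32.769.132.993.266.709.549.961.988.190.834.461.413.177."
                "642.967.992.942.539.798.288.533")
Q = from_dotted("3.490.529.510.847.650.949.147.849.619.903.898.133.417."
                "764.638.493.387.843.990.820.577")

print(len(str(N)))   # number of digits in N
print(P * Q == N)    # multiplying takes an instant; factoring took years
```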

The algorithm Gardner presented is known as RSA, an acronym from the surnames
Rivest, Shamir and Adleman. It is the first practical implementation of the
public key model posited by Diffie, and it is regularly used today. The security it offers
is almost total because breaking it without the private key is incredibly hard work,
although not impossible. Next, we will look at the basis of the system in simplified form.


The RSA algorithm in detail


The RSA algorithm is based on certain properties of prime numbers that the
interested reader can find in the Appendix. We will limit ourselves here to setting out
the basic assumptions that underlie it.


* The number of positive integers smaller than n that are coprime with n is given
by Euler’s function, written $\varphi(n)$.

* If $n = pq$, where p and q are distinct prime numbers, then $\varphi(n) = (p-1)(q-1)$.

* From “Fermat’s Little Theorem” we know that if p is a prime number and a is a whole
number larger than 0 that is not a multiple of p, then $a^{p-1} \equiv 1 \pmod{p}$.

* According to Euler’s theorem, if $\gcd(n, a) = 1$, then $a^{\varphi(n)} \equiv 1 \pmod{n}$.


As mentioned, the system is described as “public key” because the encryption
key is given to any sender interested in transmitting messages. Each recipient has his
own public key. The messages will always be transmitted translated into numbers,
be it as ASCII code or any other system.


First, James generates a value n as the product of two prime numbers p and
q ($n = p \cdot q$) and picks a value e such that $\gcd(\varphi(n), e) = 1$. Remember that
$\varphi(n) = (p-1)(q-1)$. The data that is made public is the value of n and the value
of e (under no circumstances are the values p and q disclosed). The pair (n, e) is
the public key of the system, and the values p and q are known as RSA numbers.
In parallel, James calculates the only value of d modulo $\varphi(n)$ that satisfies
$d \cdot e \equiv 1 \pmod{\varphi(n)}$, that is, the inverse of e modulo $\varphi(n)$. We know that this
inverse exists because $\gcd(\varphi(n), e) = 1$. This value d is the private key of the system.
For his part, Peter uses the public key (n, e) to encrypt a message m by means of the
function $M \equiv m^e \pmod{n}$. Having received the message, James carries out the operation
$M^d \equiv (m^e)^d \pmod{n}$. This expression is equivalent to $M^d \equiv (m^e)^d \equiv m \pmod{n}$,
which proves that the message can be deciphered.
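The final equivalence can be spelled out in one more step. The sketch below assumes, as Euler's theorem requires, that m shares no divisor with n; the integer k is introduced here purely for illustration.

```latex
\begin{align*}
d \cdot e \equiv 1 \pmod{\varphi(n)}
  \;&\Longrightarrow\; d \cdot e = 1 + k\,\varphi(n) \quad \text{for some integer } k \ge 0, \\
M^{d} \equiv m^{e d} = m^{1 + k\,\varphi(n)}
  &= m \cdot \bigl(m^{\varphi(n)}\bigr)^{k} \equiv m \cdot 1^{k} \equiv m \pmod{n}.
\end{align*}
```

When n is the product of two different primes the result also holds for messages that do share a divisor with n, although showing it takes a slightly longer argument.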

We will now apply this procedure with specific numerical values: 

If p = 3 and q = 11, we have n = 33 and $\varphi(33) = (3-1) \cdot (11-1) = 20$. James picks an e that
does not have a divisor in common with 20, for example e = 7. James’s public key
is (33, 7).

* Meanwhile, James has calculated a private key d that will be the inverse of 7
modulo 20, that is, $7 \cdot 3 \equiv 1 \pmod{20}$, and therefore d = 3.

* Peter acquires the public key and wishes to send James the message “9”. To cipher
it, he uses James’s public key and solves:


$9^7 = 4.782.969 \equiv 15 \pmod{33}$.


The ciphered message is 15. Peter sends James the message.


James receives the message 15, and deciphers it:


$15^3 = 3.375 \equiv 9 \pmod{33}$.


The message has been correctly deciphered. 
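The same small example can be reproduced in a few lines of code. This is a minimal sketch of the textbook procedure only, with no padding or key-size checks; the function names `make_keys` and `apply_key` are illustrative, and the modular inverse relies on Python 3.8 or later.

```python
from math import gcd

def make_keys(p, q, e):
    """Return (public_key, private_key) for primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(phi, e) != 1:
        raise ValueError("e must share no divisor with phi(n)")
    d = pow(e, -1, phi)  # inverse of e modulo phi(n)
    return (n, e), (n, d)

def apply_key(value, key):
    """Raise value to the key's exponent modulo n (same routine in both directions)."""
    n, exponent = key
    return pow(value, exponent, n)

public, private = make_keys(3, 11, 7)
cipher = apply_key(9, public)        # Peter encrypts the message 9
plain = apply_key(cipher, private)   # James decrypts it
print(public, private, cipher, plain)  # (33, 7) (33, 3) 15 9
```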


As we pick larger prime numbers p and q, the difficulty of carrying out the RSA
algorithm increases to a point where the use of a computer for the calculation of
the solutions becomes necessary. For example, if p = 23 and q = 17, then n = 391. The
public key that results for e = 3 is (391, 3). Consequently d = 235. For a plaintext
message like 34, the ciphered message is 204 and the deciphering operation is:


$204^{235} \equiv 34 \pmod{391}$.


Take note of the order of magnitude of the power involved, and imagine the calculation
capacity required when p and q have hundreds of digits.
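In practice the enormous power is never written out in full: the result is reduced modulo n after every multiplication. The sketch below shows one common way of doing this, square-and-multiply; the name `mod_pow` is illustrative, and Python's built-in `pow(base, exponent, modulus)` gives the same result.

```python
def mod_pow(base, exponent, modulus):
    """Compute base**exponent % modulus by repeated squaring."""
    result = 1
    base %= modulus
    while exponent > 0:
        if exponent & 1:                 # this binary digit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus   # square for the next binary digit
        exponent >>= 1
    return result

print(mod_pow(34, 3, 391))     # encryption with (391, 3): 204
print(mod_pow(204, 235, 391))  # decryption with d = 235: 34
```

No intermediate value ever exceeds the square of the modulus, so the work grows with the number of digits of the exponent rather than with its size.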


Why should we trust in the RSA algorithm? 


A potential spy knows the values of n and of e because they are public. To decipher
the message he also needs the value of d, the private key. As we
demonstrated in the preceding example, the value d is generated from e and from
$\varphi(n)$. Where does the security stem from? Let us remember that to construct d, it is
necessary to know $\varphi(n) = (p-1)(q-1)$ and, in particular, p and q. For this, it is
“sufficient” to decompose n as a product of two prime numbers p and q. The problem
for the spy is that to factorize a large number as a product of two prime numbers
is a slow and laborious process. If n is sufficiently large (of the order of more than
100 digits), there is no known way to find p and q in a reasonable amount of time.
Today, the prime numbers used in the ciphering of messages of the most sensitive
nature exceed 200 digits.
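For the toy keys used above, the spy's task is trivial, as the following sketch suggests. It uses plain trial division, chosen here only because it is the simplest factoring method; the function names are illustrative. The number of divisions grows roughly with the square root of n, which is what makes the approach hopeless for numbers of hundreds of digits.

```python
def smallest_factor(n):
    """Return the smallest prime factor of n by trial division."""
    if n % 2 == 0:
        return 2
    f = 3
    while f * f <= n:
        if n % f == 0:
            return f
        f += 2
    return n  # no divisor found: n itself is prime

def recover_private_exponent(n, e):
    """What a spy would do: factor n, rebuild phi(n), then invert e."""
    p = smallest_factor(n)
    q = n // p
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)

print(recover_private_exponent(33, 7))   # 3
print(recover_private_exponent(391, 3))  # 235
```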


Reasonable privacy 


The RSA algorithm consumes a great deal of computing time and requires high- 
powered processors. Until the 1980s, only governments, the military, and large 
enterprises had sufficiently powerful computers to work with RSA. As a result, 
they enjoyed a de facto monopoly over effective encryption. In the summer of 1991, 
Philip Zimmermann, an American physicist and activist for privacy, offered free of 
charge the PGP (Pretty Good Privacy) system, an encryption algorithm capable of 
working on home computers. PGP employs the classic symmetrical codification,
which gives it greater speed on home computers, but it ciphers the keys with
asymmetrical RSA.
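The division of labour just described (a fast symmetrical cipher for the message, RSA only for the key) can be sketched as follows. Everything here is a toy under stated assumptions: the key pair is built from two well-known Mersenne primes rather than random ones, the symmetrical cipher is a simple hash-based keystream rather than the cipher PGP actually uses, and the function names are illustrative.

```python
import hashlib
import secrets

# Toy RSA key pair (assumption: fixed Mersenne primes, no padding).
P, Q, E = 2**107 - 1, 2**127 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))

def keystream_xor(data, key):
    """Toy symmetrical cipher: XOR the data with a SHA-256-derived keystream."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)

def hybrid_encrypt(message, n, e):
    session_key = secrets.token_bytes(16)                    # fresh symmetrical key
    wrapped = pow(int.from_bytes(session_key, "big"), e, n)  # RSA protects only the key
    return wrapped, keystream_xor(message, session_key)

def hybrid_decrypt(wrapped, body, n, d):
    session_key = pow(wrapped, d, n).to_bytes(16, "big")
    return keystream_xor(body, session_key)

wrapped, body = hybrid_encrypt(b"It's personal. It's private.", N, E)
print(hybrid_decrypt(wrapped, body, N, D))
```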

Zimmermann explained the reasons for this measure in an open letter that
deserves to be quoted here, at least partially, for its prescient description of the way
we live, work and communicate two decades later.


“It’s personal. It’s private. And it’s no one’s business but yours. You may be planning
a political campaign, discussing your taxes, or having a secret romance. Or
you may be doing something that you feel shouldn’t be illegal, but is. Whatever
it is, you don’t want your private electronic mail or confidential documents read
by anyone else. There’s nothing wrong with asserting your privacy. Privacy is as
apple-pie as the Constitution...


“We are moving toward a future when the nation will be crisscrossed with
high capacity fibre-optic data networks linking together all our increasingly
ubiquitous personal computers. E-mail will be the norm for everyone, not the
novelty it is today. The government will protect our E-mails with Government-designed
encryption protocols. Probably most people will acquiesce to that.
But perhaps some people will prefer their own protective measures... If privacy
is outlawed, only outlaws will have privacy.


“Intelligence agencies have access to good cryptographic technology. So do
the big arms and drug traffickers. So do defense contractors, oil companies,
and other corporate giants. But ordinary people and grassroots political organizations
mostly have not had access to affordable military-grade public-key
cryptographic technology. Until now.


“PGP empowers people to take their privacy into their own hands. There's a
growing social need for it. That’s why I wrote it.”


SECURITY FOR EVERYONE


Philip Zimmermann, born in 1954, is an American
physicist and software engineer who has spearheaded
a movement that aims to make modern cryptography
available to all. Besides launching the PGP system, in
2006 he created Zfone, a software program for secure
voice communication over the internet, and he is president
of the OpenPGP Alliance, a lobby group in favour
of open code software.


From Zimmermann’s reflections, we can see that the price of living during the
information age is to have our traditional notions of privacy threatened. Conse- 
quently, a good understanding of the codification and encryption mechanisms that 
surround us would not only make us wiser, but could also prove to be enormously 
useful when it comes to protecting what is valuable to us. 

The use of PGP has been spreading since its creation and constitutes the most
important private cryptography tool available today.


Authentication of messages and keys 


The different systems of public key encryption — or public and private keys com- 
bined, like PGP — ensure a high level of confidentiality in the transmission of 
information. However, the security of a complex communication system like the 
Internet does not reside solely in confidentiality. 

Before the arrival of modern communication technologies, the vast majority 
of messages originated from known sources, such as family, friends, or a handful 
of professional relationships. Today, however, each individual is bombarded by an
avalanche of communications from a myriad of sources. The authenticity of these
communications can frequently be impossible to determine just by reading them,
with all the problems that derive from that. For example, how can we prevent
someone falsifying the address of origin of an email?

Diffie and Hellman themselves proposed an ingenious way to use public key 
encryption to authenticate the origin of a message. In a cryptography system of 
this type, the sender ciphers the message with the public key of the receiver, who 
in turn uses his own private key to decipher the message. Diffie and Hellman noticed
that RSA and other similar algorithms displayed an interesting symmetry. The
private key could also be used to cipher a message, and the public key to decipher it.
This operation provides no confidentiality whatsoever (the public key is easily available
to everyone), but it does assure the receiver that the message comes from a specific
sender, the owner of the private key. To authenticate the sender of a message it is
sufficient, in theory, to add an additional encryption to the normal one with the
following process:


1. The sender encrypts a message with the receiver's public key. This first step
ensures confidentiality.

2. The sender again encrypts the message, this time with his private key. In this
way the message is authenticated or “signed”.

3. The recipient uses the sender's public key to undo the encryption of step 2.
Thus the origin of the message is verified.

4. The receiver now uses his private key to undo the encryption of step 1.
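The four steps can be traced with toy numbers. In this sketch the receiver's key pair is the (33, 7) and d = 3 pair from the earlier worked example, while the sender's pair, (391, 3) with d = 235, is assumed here for illustration. The receiver's modulus is deliberately the smaller of the two so that the output of step 1 fits inside the sender's modulus in step 2.

```python
# Toy key pairs written as (n, exponent); insecure by design.
RECEIVER_PUBLIC, RECEIVER_PRIVATE = (33, 7), (33, 3)
SENDER_PUBLIC, SENDER_PRIVATE = (391, 3), (391, 235)

def apply_key(value, key):
    """Raise value to the key's exponent modulo the key's n."""
    n, exponent = key
    if not 0 <= value < n:
        raise ValueError("value must be smaller than the modulus")
    return pow(value, exponent, n)

def send(message):
    step1 = apply_key(message, RECEIVER_PUBLIC)   # 1. confidentiality
    return apply_key(step1, SENDER_PRIVATE)       # 2. signature

def receive(transmission):
    step3 = apply_key(transmission, SENDER_PUBLIC)  # 3. origin verified
    return apply_key(step3, RECEIVER_PRIVATE)       # 4. message recovered

print(receive(send(9)))  # 9
```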


Hash functions 


One of the problems with the theoretical outline above is that public key
encryption requires a considerable computational capacity, and to repeat the
process for the purpose of signing and verifying every message would be extremely
time consuming. That is why, in practice, the signing of a message is carried out with
mathematical resources known as hash functions. Starting with the original message,
these algorithms generate a short chain of bits (usually 160), called a hash, and
they do it in such a way that the probability that different messages are associated
with the same hash is almost zero. Also, it is practically impossible to undo the
process and obtain the original message when only starting with its hash. The hash
of any message is encrypted by the sender with his private key and it is sent along
with the ciphered message in the conventional manner. The receiver decrypts the
encrypted hash with the sender’s public key. Next, and given that he
knows the hash function used by the sender, he applies that function to the message
and compares the two hashes. If they match, the identity of the sender is verified
and, moreover, it is certain that no one else has altered the original message.
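A minimal sketch of this hash-then-sign procedure follows. It assumes SHA-1 as the 160-bit hash function (chosen only because it matches the length mentioned above; it is no longer recommended for new signatures) and a toy RSA key built from two fixed Mersenne primes. The function names are illustrative.

```python
import hashlib

# Toy RSA key pair (assumption: fixed Mersenne primes, no padding).
P, Q, E = 2**107 - 1, 2**127 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))

def digest(message):
    """160-bit SHA-1 hash of the message, as an integer smaller than N."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big")

def sign(message, n, d):
    """The sender encrypts the hash with his private key."""
    return pow(digest(message), d, n)

def verify(message, signature, n, e):
    """The receiver decrypts the hash and compares it with his own."""
    return pow(signature, e, n) == digest(message)

signature = sign(b"Pay 100 dollars to the first solver", N, D)
print(verify(b"Pay 100 dollars to the first solver", signature, N, E))  # True
print(verify(b"Pay 900 dollars to the first solver", signature, N, E))  # False
```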


Message                                              Hash

RED                                                  DKJD 1242 AACB 788B 761A 696C 24D9 7009 CA99 2D17

THE COLOUR RED CORRESPONDS TO THE LOWEST FREQUENCY   0896 56BB ZC7D CBE2 823C ADD7 8CD1 9AB2 JJ6J SABC

THE COLOUR RED CORRESPONDS TO THE LOWEST FREQUENCY   FCD3 7FDB D588 4C75 4BF4 1799 7D88 ACDE 92B9 6A6C

THE COLOUR RED CORRESPONDS TO THE LOWEST FREQUENCY   D401 COA9 7D9A 46AF FB45 76B1 79A9 ODA4 AEFE 4819

Tiny changes in the content of the message generate totally different “hashes”. In this way, the 
receiver can be sure that the text has not been manipulated 
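The effect is easy to observe with a real hash function. The sketch below uses SHA-1 from Python's standard library as a stand-in for whatever function produced the table above; the altered message differs from the original by a single full stop.

```python
import hashlib

def sha1_hex(text):
    """Return the 160-bit SHA-1 hash of text as 40 hexadecimal digits."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest().upper()

original = "THE COLOUR RED CORRESPONDS TO THE LOWEST FREQUENCY"
altered = "THE COLOUR RED CORRESPONDS TO THE LOWEST FREQUENCY."

for text in (original, altered):
    print(sha1_hex(text), text)

# Count how many of the 40 hexadecimal digits changed.
differing = sum(a != b for a, b in zip(sha1_hex(original), sha1_hex(altered)))
print(differing, "of 40 hexadecimal digits differ")
```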


Certificates of public keys 


However, the most important problem to be confronted in a public key cryptog- 
raphy system is found, not in the authentication of the messages, but rather in that 
of the public keys themselves. How do the sender and the recipient know that the 
public key of the other is valid? Let us suppose that a spy deceives the sender by 
giving him his own public key while making him believe that it is the receiver's 
key. If the spy manages to intercept a message, he can now use his private key to 
decrypt it. To avoid being discovered, the spy uses the public key of the receiver to 
re-encrypt the message and send it to its original destination. 
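The deception can be followed step by step with toy numbers. In this sketch the receiver's genuine key pair is the (33, 7) pair from the earlier example, and the spy's pair (391, 3) is assumed for illustration.

```python
RECEIVER_PUBLIC, RECEIVER_PRIVATE = (33, 7), (33, 3)   # genuine keys
SPY_PUBLIC, SPY_PRIVATE = (391, 3), (391, 235)          # the spy's own pair

def apply_key(value, key):
    """Raise value to the key's exponent modulo the key's n."""
    n, exponent = key
    return pow(value, exponent, n)

message = 9
believed_key = SPY_PUBLIC                        # the sender has been deceived
intercepted = apply_key(message, believed_key)   # encrypted for the wrong person
stolen = apply_key(intercepted, SPY_PRIVATE)     # the spy reads the message...
forwarded = apply_key(stolen, RECEIVER_PUBLIC)   # ...and re-encrypts it
delivered = apply_key(forwarded, RECEIVER_PRIVATE)

print(stolen, delivered)  # 9 9: neither party notices anything wrong
```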


This is why there are both public and private institutions devoted to the independent
certification of public keys. A certificate of this type contains, besides the
corresponding key, information on its holder and an expiration date. The holders
of these types of keys make their certificates public, and they can now be used and
exchanged with a certain degree of security.


DIGITAL STEGANOGRAPHY 


Although it may appear paradoxical, the development of the new technologies has generated 
a revival of steganography. A conventional audio file consists of values of 16 bits reproduced 
at a rate of 44.1 kHz. It is very simple to use some of these bits to transmit a secret message 
without the listener perceiving any acoustic difference whatsoever. Image files can also be
used to transmit hidden information.
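One simple way of using "some of these bits" is to overwrite only the least significant bit of each sample. The sketch below assumes that technique and works on a plain list of integers standing in for 16-bit audio samples; the function names are illustrative.

```python
def hide(samples, secret):
    """Write the bits of `secret` into the lowest bit of successive samples."""
    bits = [(byte >> i) & 1 for byte in secret for i in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("not enough samples to carry the message")
    carrier = list(samples)
    for i, bit in enumerate(bits):
        carrier[i] = (carrier[i] & ~1) | bit
    return carrier

def reveal(samples, length):
    """Read `length` bytes back from the lowest bits of the samples."""
    out = bytearray()
    for i in range(length):
        byte = 0
        for sample in samples[8 * i:8 * i + 8]:
            byte = (byte << 1) | (sample & 1)
        out.append(byte)
    return bytes(out)

audio = [1000 + 37 * i for i in range(64)]   # stand-in for 16-bit samples
stego = hide(audio, b"3.1415")
print(reveal(stego, 6))                                 # b'3.1415'
print(max(abs(a - b) for a, b in zip(audio, stego)))    # no sample moves by more than 1
```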


An example of digital steganography: the number pi to four decimal places is hidden in a
tiny fragment of a larger image. On the left, the photo, apparently normal, and on the right,
the pixels extracted from one small area that conceals the number 3.1415.


But is it safe to buy on the Internet?


Most on-line spies and hackers have little interest in the messages exchanged by
ordinary people, with one notable exception: the numbers of their credit cards.
The cryptography system that protects the transmission of such a sensitive piece of
information works at the transport “layer”, in information science jargon, and is
known as TLS (Transport Layer Security). Its predecessor, SSL (Secure Sockets Layer),
was developed by the Internet software corporation Netscape in 1994, and TLS itself
was published as an Internet standard in 1999.

The TLS protocol combines public and symmetrical keys in a rather complex
process that is presented here in summary form. First, the web browser of the online
purchaser verifies that the online seller has a valid public key certificate. If so, it
uses this public key to encrypt a second key, this one symmetrical, that it sends to
the seller. The seller then uses his private key to decrypt the message and get the
symmetrical key, which will be the one used to cipher all the information being
processed. As a consequence, to acquire the credit card number in any online
transaction, the spy will have to penetrate not one, but two encryption systems.


Chapter 6
A Quantum Future


According to Philip Zimmermann (see Security for Everyone, page 108) in Simon 
Singh’s book The Code Book, “In modern cryptography, it is possible to create 
ciphers that really are beyond the reach of all the known forms of cryptanalysis.” As 
we have noted, to break encryption algorithms like RSA or DES and even mixed 
systems like PGP by brute force is beyond the computing capacity of the fastest 
of computers. Is it conceivable that some type of mathematical short cut could 
allow future spies to reduce the complexity of cryptanalysis? Although this 
possibility cannot be discarded, no one considers it very probable. 

Is Zimmermann right? Has the thousand-year-old conflict between cryptographers
and cryptanalysts been resolved?


Quantum computing 


The answer is not exactly. In the last decades of the 20th century, quantum com- 
puting, a new and revolutionary way of designing and operating computers ap- 
peared. Although still only at the theoretical stage, a quantum computer could have 
the calculating power to decipher today’s encryption algorithms by trial and error. 
Cryptanalysis may be back in the game one day. 

This embryonic technological revolution is based on quantum mechanics, a theoretical
edifice erected at the start of the last century by scientists including the Dane
Niels Bohr (1885-1962), the Briton Paul Dirac (1902-1984), the Germans
Max Planck (1858-1947) and Werner Heisenberg (1901-1976), and the Austrian Erwin
Schrödinger (1887-1961), among many others. The vision of the universe postulated by quantum
mechanics is so profoundly counter-intuitive that Albert Einstein was famously
quoted in opposition to it: “God does not play dice.” Despite Einstein’s reservations,
the theory of quantum mechanics has been successfully tested on countless
occasions, and its validity is now beyond question. The scientific community as a
whole assumes that at the macroscopic level (that is, the universe of the stars, of
houses and of molecules) the universe follows the laws of classical physics. However,
in the quantum world (the impossibly small realm of subatomic particles such as
quarks, photons and electrons) a different set of rules applies, leading to astounding
paradoxes. Without this theory, there would be no such thing as nuclear reactors or
laser readers. There would be no way to explain the brilliance of the sun or the
functioning of DNA.


Niels Bohr (above left) with Max Planck, two fathers of quantum physics, 
in a photograph taken in 1930. 


The cat that was neither dead nor alive 


In a quantum physics seminar held in 1958, Bohr gave his opinion on the proposi- 
tion of one of the speakers as follows: “We all agree that his theory is crazy. The 
question that divides us is whether it is crazy enough that it could have a chance 
of being correct.” How crazy is quantum mechanics, really? By way of example, 
let’s take the principle of the superposition of states. A particle presents a super- 
position of states when it occupies more than one position at the same time, or 
when it simultaneously possesses different quantities of energy. However, when
an observer measures the particle it will always be seen to adopt one position or
to possess a specific quantity of energy. Schrödinger himself devised a thought
experiment, “Schrödinger’s cat,” to illustrate this apparently ridiculous notion. Imagine
a cat is placed in a sealed, opaque box. Inside the box there is also a flask
containing a noxious gas that is connected by some device to a radioactive particle
so that, if the particle decays, the gas escapes from the container, and the cat is
poisoned. The particle in question has a 50% probability of decaying during a determined
period of time. The whole set up, depending as it does on the behaviour
of a single particle, is subject to the laws of quantum physics.


Schrödinger’s cat is a thought experiment that illustrates the quantum theory
concept of the superposition of states. 


Let us suppose that the determined period of time has passed. The question is: 
Is the cat alive or dead? Or, in the jargon of quantum mechanics, what is the state 
of the box-cat-system? The answer to the question is that, until the observer opens 
the box and “measures” the state of the system, the particle may or may not have 
disintegrated and, therefore, there is a system of superposed states: the cat is, strictly 
speaking, neither alive nor dead, but both simultaneously. 

For all those who consider the superposition of states to be a far-fetched
hypothesis, it is important to note that alternative interpretations have been proposed
by respected physicists. For example, the theory known as the many-worlds interpretation
maintains that the notion of the superposition of states is an unsustainable
thesis and that what occurs in reality is that, for each of the possible states a particle
may find itself in (position, quantity of energy, etc.), there exists an alternative
universe where the particle adopts one specific state. In other words, in one universe
the cat in the box is alive, and in another, dead. When the observer opens the box
and verifies that our feline friend is in fact alive, he does so as an integral part of
only one of the possible universes. In another parallel universe (complete with its
own stars, planets, train stations and ants) this same observer looks inside the box
and verifies, undoubtedly with some sadness, that the cat has succumbed to the
deadly poison. The supporters of the many-worlds interpretation still haven't
clarified how these universes interact with each other. Even so, what the theory
shows is that it is the interpretation of why quantum reality behaves in this way
that is in question, not the behaviour itself, which has been confirmed in numerous
conclusive experiments.


From bit to qubit 


What, however, is the relation between the superposition of states of particles
and computation, let alone cryptography? Until 1984 nobody would have even
thought to propose a relationship between the two fields. Around that time, the
British physicist David Deutsch began to throw around a revolutionary idea: what
would computers be like if, instead of submitting to the laws of classical physics, 
they obeyed instead those of quantum mechanics? How could computing benefit 
from the superposition of states of particles? 

Let us recall that conventional computers handle minimal units of information,
called bits, capable of assuming one of two opposing values: 0 and 1. A quantum computer, on
the other hand, could take as its smallest unit of information a particle that presents
two possible states. For example, the spin of an electron can only be in one of two
directions, up or down. This particle would have the fantastic property of representing
the value 0 (spin down) or the value 1 (spin up). Through the superposition of
spin states, it could represent the two values simultaneously. This new unit of information
has been called a qubit, a contraction of quantum bit, and its manipulation
could open the doors to a world of super-powerful computers.

A conventional computer performs its calculations sequentially. Let us take as
an example the numeric information contained in 32 bits. With this number of
bits, we can encode numbers from 0 to 4,294,967,295. If a conventional computer
had to find a specific number within that group, it would have to check the
candidates one by one. However, a quantum computer could perform the task much faster.
To illustrate how, imagine we can put 32 electrons in a special container and make them enter
a superposition of states. Then, by applying electric impulses strong enough to
change the spin of an electron from up to down, these 32 electrons (the qubits of
our quantum computer) would represent all the possible combinations of spin up
(1) and spin down (0) simultaneously. As a result, the search for the desired number
would be performed on each and every one of the possible options all at once. If
we increase the quantity of qubits to, say, 250, the number of simultaneous operations
that could be performed would be about $10^{75}$, a figure often compared with the number
of atoms that our universe is thought to contain.
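The two figures quoted above follow directly from counting combinations: n two-state units can take $2^n$ different configurations. A short check (the function name `states` is illustrative):

```python
def states(units):
    """Number of distinct configurations of `units` two-state elements."""
    return 2 ** units

print(states(32) - 1)          # largest number that fits in 32 bits: 4294967295
print(f"{states(250):.2e}")    # configurations of 250 qubits: 1.81e+75
```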

The work of Deutsch proved that quantum computers were a theoretical pos- 
sibility. That they become a practical reality one day is the objective of dozens of 
institutions and research groups throughout the world. So far, however, they have not 
been capable of overcoming the technical difficulties of building a viable quantum 
computer. Some experts believe it will take another 15 or 25 years to achieve it;
others doubt that it is even possible.


A BIG BROTHER FOR THE 21st CENTURY


The consequences of building a viable quantum computer would not just be limited to the
collapse of cryptography as we know it. Such a calculating power placed at the service of any
interest, be it public or private, could shift the balance of world power. The battle to be the first
country to develop such a technology could easily end up being the next technological race,
emulating the space and arms races of the last half of the twentieth century. It is not outrageous
to think that decisive advances in this field might best be kept secret for reasons of national
security. Is there somewhere in the world, in a refrigerated underground tunnel, a quantum
computer waiting to be placed in full operation to change our lives forever?


GOODBYE, DES, GOODBYE


Two years after Shor demonstrated that a quantum computer could conquer the RSA cipher,
another American, Lov Grover, did the same with another mainstay of modern cryptography,
the DES algorithm. Grover designed a program that allowed a quantum computer to find the
correct numeric value from a list of possible values in a time proportional to the square root
of the time it would have taken a conventional computer. Another commonly used algorithm
that would be affected by this innovation is the RC5, the standard used by Microsoft's web browsers.
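To get a feeling for what a square-root saving means, consider the 56-bit keys used by DES. The figures below are a rough worked estimate that counts only the number of keys to be examined and ignores the cost of each quantum step.

```latex
\[
N = 2^{56} \approx 7.2 \times 10^{16} \ \text{possible keys},
\qquad
\sqrt{N} = 2^{28} \approx 2.7 \times 10^{8} \ \text{steps}.
\]
```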


The end of cryptography? 


Quantum computing would lead to the death of cryptography as we know it. Let’s
take as an example the star of modern encryption algorithms, RSA. As we recall,
whoever tries to crack an RSA code by brute force will have to successfully factorise
the product of two very large prime numbers. This operation is extraordinarily
laborious and so far no mathematical shortcut has been found to solving it. Could
a quantum computer take on the challenge of factoring a number of the
size handled by RSA codes? Peter Shor, the American computer scientist, answered
affirmatively in 1994. Shor designed an algorithm executable by a quantum computer,
and capable of breaking down enormous numbers in vastly less time than
even the most powerful conventional computer.

If this astounding device were ever to be constructed, Shor’s algorithm would 
demolish, piece by piece, the powerful cryptographic infrastructure built around 
RSA and, overnight, the most secret information on the planet would be exposed to 
the light of day. All contemporary encryption systems would follow the same path. 
Paraphrasing Mark Twain, we could say that the reports of the death of cryptanalysis
have been “greatly exaggerated.”


What quantum mechanics takes away, 
quantum mechanics gives back 


One of the foundations of quantum mechanics is called the Uncertainty Principle,
elucidated by Werner Heisenberg in 1927. Although its exact formulation
is very technical, its own creator dared to summarise it as follows: “In principle
we cannot know the present in all its detail.” More concretely, it is impossible
to determine simultaneously, with complete accuracy, certain complementary properties of
a particle at any given moment. Let us take, for example, the case of light particles
(photons). One of their fundamental characteristics is their polarisation, a
technical term that refers to the oscillation or vibration of an electromagnetic
wave. [Although photons vibrate in all directions, for the purpose of this brief
exposition we will assume that they vibrate in four: vertical (↕), horizontal (↔),
diagonal to the left (⤡), diagonal to the right (⤢).] Well, then, Heisenberg’s principle
implies that the only way to verify the polarisation of a particular photon is by
passing it through a filter or “slit” that in turn can be either horizontal, vertical, or
diagonal to the left or right. The photons polarised horizontally will pass the horizontal
filter unchanged, while those that are polarised vertically will be blocked.
As for the photons that are diagonally polarised, half of them will pass through the
filter with their polarisation changed to horizontal, and the other half will rebound,
completely at random. Furthermore, once a photon is emitted from the filter, it is
not possible to know with certainty what its original polarisation was.


If we pass a series of photons of different polarisations through a horizontal filter, we see that half 
of those oriented diagonally pass through the filter with their polarisation changed to horizontal. 
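The behaviour of the horizontal filter can be imitated with a small simulation. This is only a sketch of the simplified four-direction model used in the text, with each photon reduced to a label and ordinary pseudo-random numbers standing in for quantum chance; the names are illustrative.

```python
import random

# Chance that a photon of each polarisation passes a horizontal filter.
PASS_PROBABILITY = {"vertical": 0.0, "horizontal": 1.0,
                    "diagonal-left": 0.5, "diagonal-right": 0.5}

def through_horizontal_filter(polarisation, rng):
    """Return 'horizontal' if the photon emerges, or None if it is blocked."""
    if rng.random() < PASS_PROBABILITY[polarisation]:
        return "horizontal"   # whatever it was before, it leaves horizontal
    return None

rng = random.Random(2024)     # fixed seed so that runs are repeatable
for polarisation in PASS_PROBABILITY:
    passed = sum(through_horizontal_filter(polarisation, rng) is not None
                 for _ in range(10_000))
    print(polarisation, passed / 10_000)
```

Every photon that emerges carries the same label, which is why the original polarisation cannot be read off afterwards.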


What is the relationship between the polarisation of photons and cryptography?
A very substantial one, as we shall see below. To begin with, we will assume the role of a
researcher who wants to know the polarisation of a series of photons. To do this he
has no other option but to select a filter with a fixed orientation, such as horizontal.
Let’s suppose that a photon passes through the filter. What information does the
researcher get from this? Of course, he can assume that the original polarisation of
the photon was not vertical. Can he make any other assumption? No. At first one
could think that it is more likely that the original photon was oriented
horizontally than diagonally, because half of the diagonals would not pass through
the filter. However, there are two diagonal orientations for every horizontal one, so
the number of diagonally oriented photons is also double the
number that are horizontal. It is important to emphasise that the difficulty in detecting
the polarisation of a photon is not the result of some technological or theoretical
shortcoming capable of being rectified in the future; it is a consequence of the nature
of subatomic reality itself. If appropriately exploited, this property can be used to
build a completely unbreakable code, the Holy Grail of cryptography.
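The researcher's reasoning can be made explicit with a short calculation. It assumes, as the text implicitly does, that each of the four polarisations is equally likely before the measurement.

```latex
\begin{align*}
P(\text{pass}) &= \tfrac14 \cdot 1 + \tfrac14 \cdot 0 + \tfrac12 \cdot \tfrac12 = \tfrac12, \\
P(\text{horizontal} \mid \text{pass}) &= \frac{\tfrac14 \cdot 1}{\tfrac12} = \tfrac12, \\
P(\text{diagonal} \mid \text{pass}) &= \frac{\tfrac12 \cdot \tfrac12}{\tfrac12} = \tfrac12.
\end{align*}
```

A photon that passes the filter is therefore just as likely to have been diagonal as horizontal, and the only thing the researcher has learned is that it was not vertical.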


The indecipherable cipher 


In 1984, the American Charles Bennett and the Canadian Gilles Brassard conceived
the idea of an encryption system based on the transmission of polarised
photons. The first step consists of the sender and the receiver agreeing on a method
to assign a 0 or a 1 to one polarisation or another. In the example here, the assignment
of 0 and 1 will be a function of two schemes or bases of polarisation: the
first base, called rectilinear and represented by the symbol +, maps the 1 to the
polarisation ↕ and the 0 to the polarisation ↔; the second base, called diagonal
and represented by the symbol ×, assigns the 1 to the polarisation ⤢ and the 0 to
⤡. For example, the message 0100101011 could be transmitted as follows:


[Figure: the ten bits of the message, the base chosen for each one, and the resulting photon polarisations.]
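The encoding step can be sketched in code. The mapping below follows the two bases just described, with the direction assigned to each diagonal bit taken as an assumption; the sequence of bases is drawn at random here, since the one in the original figure is not recoverable. Reading a photon in the wrong base is modelled as producing a random bit.

```python
import random

# Assumed mapping: base "+" (rectilinear) and base "x" (diagonal).
ENCODE = {("+", 1): "vertical", ("+", 0): "horizontal",
          ("x", 1): "diagonal-right", ("x", 0): "diagonal-left"}
DECODE = {polarisation: key for key, polarisation in ENCODE.items()}

def transmit(bits, bases):
    """Turn each bit into a polarisation according to its base."""
    return [ENCODE[(base, bit)] for bit, base in zip(bits, bases)]

def measure(photons, bases, rng):
    """Read each photon in the given base; a mismatched base gives a random bit."""
    result = []
    for photon, base in zip(photons, bases):
        sent_base, bit = DECODE[photon]
        result.append(bit if base == sent_base else rng.randint(0, 1))
    return result

rng = random.Random(7)
message = [int(c) for c in "0100101011"]
bases = [rng.choice("+x") for _ in message]
photons = transmit(message, bases)
print(list(zip(message, bases, photons)))
print(measure(photons, bases, rng) == message)  # True when the same bases are used
```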


If a spy intercepts the transmission, he would need to use a filter with a fixed
× orientation:


[Table: for each bit of the original message, the polarisation detected by the spy and the possible messages it is compatible with.]


As we can see, not knowing the original base, the spy is unable to get any relevant
information whatsoever from the polarisation detected. Even knowing the scheme
of assigning 0 and 1 used by the sender and the receiver, if the former alternates the
bases in a random fashion, the spy will be mistaken approximately one-third of the
time (the table shows a breakdown of all the sending and receiving combinations
possible under the described conditions). However, there is a glaringly obvious
problem: the receiver is in no better a position than the spy.

Having reached this point, the sender and the recipient could get round the 
problem by sending each other the sequence of bases used through some secure 
medium, such as ciphering with RSA. But then the security of the cipher would 
be at risk from those hypothetical quantum computers. 

To overcome this last obstacle, Brassard and Bennett had to add another subtlety
to their method. If the reader recalls, the Achilles’ heel of the polyalphabetic
ciphers of the family of de Vigenère’s square was that the use of short, repeated keys
generated a regularity in the cipher that created a small but significant opportunity 
for the cryptanalyst. What would happen, however, if the key used were a random 
string of characters longer than the message? And what if, for greater security, every 
message, however insignificant, were ciphered with a different key? The answer is 
that we would have an unbreakable cipher. 

The first person to suggest the use of the polyalphabetic cipher with a unique 
key was Joseph Mauborgne. Shortly after World War I, when he was the chief signals 
officer for the American cryptographic service, Mauborgne imagined a notepad of 
keys composed of random series of more than 100 characters each, that would be 
given to the sender and the receiver with instructions to destroy the key used on each
occasion and to move on to the following one. This system, known as the one-time 
pad, is, as we said, unbreakable, and can be demonstrated as such mathematically. In 
fact, top secret communications between some heads of state are carried out with 
this method. 
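
The idea can be tried out with a minimal Python sketch. It assumes a 26-letter alphabet and messages written only in capital letters; the function names are illustrative and do not come from the text. The key is random and as long as the message, and the last lines show why trial and error is hopeless: for any text of the same length there is a key that turns the ciphertext into it.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # assumption: messages use only A-Z


def random_key(length):
    """Return a random key with one letter per message letter."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def shift(text, key, sign):
    """Shift each letter by the matching key letter (sign=+1 ciphers, -1 deciphers)."""
    if len(key) < len(text):
        raise ValueError("a one-time pad key must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(t) + sign * ALPHABET.index(k)) % 26]
        for t, k in zip(text, key)
    )


def encrypt(plaintext, key):
    return shift(plaintext, key, +1)


def decrypt(ciphertext, key):
    return shift(ciphertext, key, -1)


def key_that_yields(ciphertext, wanted_plaintext):
    """Find the key under which the ciphertext deciphers to any chosen text."""
    return shift(ciphertext, wanted_plaintext, -1)


message = "ATTACKATDAWN"
key = random_key(len(message))
cipher = encrypt(message, key)
print(cipher, decrypt(cipher, key))

# Another key turns the same ciphertext into a different, equally plausible text.
fake_key = key_that_yields(cipher, "RETREATATTEN")
print(decrypt(cipher, fake_key))
```

Because the key is drawn afresh each time, the ciphertext printed differs from run to run, while both decryptions stay the same.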

If the cipher of the one-time pad is so secure, why hasn’t its use spread? Why 
are we worrying about the power of quantum computers and even mentioning the 
manipulation of photons? 

Leaving aside the logistical difficulties of generating thousands of random single-use
keys to cipher the same number of messages, the cipher on the one-time pad
presents the same weakness as the other classical encryption algorithms: key
distribution, just the thing that modern cryptography has been so eager to resolve.
[Table: for each combination of the sender's base and bit, the polarisation transmitted, the receiver's detector, whether the detector is correct, the polarisation detected, the receiver's bit, and whether that bit is correct.]


However, the transmission of information by polarised photons is the perfect 
channel for submitting a unique key without danger. For this to occur, three steps 
prior to transmitting the message are necessary: 

1. First, the sender sends the receiver a random sequence of 1s and 0s by
means of different, equally random, filters of vertical (↕), horizontal (↔) and
diagonal (⤢, ⤡) alignment.

2. The receiver goes on to measure the polarisation of the received photons by
the random alternation of rectilinear bases (+) and diagonal bases (X). Since he
does not know the sequence of filters used by the sender, a large part of his
sequence of 0s and 1s will be wrong.

3. Finally, the sender and the receiver make contact in whatever manner they
prefer, without needing to worry that it is an insecure channel, and they
exchange the following information: first, the sender explains what base,
rectilinear or diagonal, must be employed to read each photon correctly, but without
revealing its polarisation (that is, the filter used). For his part, the receiver tells
him in what cases he got the base selection right. As we can see in the preceding
table, if a sender and a receiver get their respective bases right, we can be
certain that the transmission of 0 and 1 has been completed correctly. Lastly, and
in private, they each throw away the bits that correspond to the photons that
the receiver identified with the mistaken base.


BABEL'S MESSAGE


The Argentinian writer Jorge Luis Borges imagined, in the short story The Library of
Babel, a library so vast that its shelves contained all possible books: every novel,
poem and thesis, and the refutations of these theses, and the refutations of the
refutations, and so on to infinity. A cryptanalyst trying to decipher by trial and
error a message ciphered with a one-time pad would meet with a similar situation.
Since the cipher is completely random, the possible decryptions would contain all
possible texts of the same length: the real message, and a (brief) refutation of the
message, and the same message with all the proper nouns exchanged for others of the
same length, and so on to infinity.


The result of this process is that the sender and the recipient now share a
sequence of 1s and 0s generated completely randomly: the selection of the
polarisation filters used by the sender is random, as is the selection of bases used
by the receiver. A modest twelve-bit example of the process described above appears in
the following drawing:


[Figure: a twelve-bit example of the process, starting from the bits of the sender.]

Note that some bits are discarded even though they were correctly interpreted.
This is done because the recipient, having used the wrong base, cannot
be certain of having detected them correctly. If the initial transmission consists
of a sufficient number of photons, the sequence of 1s and 0s will be long enough to
constitute a one-time pad capable of ciphering messages of a normal length.
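
The three preliminary steps can be simulated with a short Python sketch. It is an idealised model and rests on assumptions not spelled out in the text: a photon is represented as a (base, bit) pair, a measurement in the right base always returns the bit sent, and a measurement in the wrong base returns 0 or 1 with equal probability. The names `measure` and `sift_key` are illustrative.

```python
import random

# Assumed encoding: base "+" (rectilinear) or "x" (diagonal);
# each photon is modelled as a (base, bit) pair.


def measure(photon, base, rng):
    """Right base: return the bit sent. Wrong base: return a random bit."""
    sent_base, bit = photon
    return bit if base == sent_base else rng.randint(0, 1)


def sift_key(n_photons, rng):
    """Run the three steps and return the bits retained by sender and receiver."""
    # Step 1: the sender picks random bits and random bases.
    sender_bases = [rng.choice("+x") for _ in range(n_photons)]
    sender_bits = [rng.randint(0, 1) for _ in range(n_photons)]
    photons = list(zip(sender_bases, sender_bits))
    # Step 2: the receiver measures each photon in a random base.
    receiver_bases = [rng.choice("+x") for _ in range(n_photons)]
    receiver_bits = [measure(p, b, rng) for p, b in zip(photons, receiver_bases)]
    # Step 3: they compare bases in public and keep only the matching positions.
    keep = [i for i in range(n_photons) if sender_bases[i] == receiver_bases[i]]
    return [sender_bits[i] for i in keep], [receiver_bits[i] for i in keep]


rng = random.Random(2024)
sender_key, receiver_key = sift_key(1000, rng)
print(len(sender_key), sender_key == receiver_key)
```

Roughly half of the photons survive the comparison of bases, and on those positions the two sequences agree, which is what makes them usable as a shared key.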

Let us now put ourselves in the place of a spy who has intercepted both the sent 
photons and the public conversations of the sender and the receiver. We have already 
seen that, without knowing exactly what polarisation filter was used by the sender 
of the message, it is impossible to determine when we have detected the correct 
polarisation. Nor is the information exchanged by the sender and the receiver of 
any help, because they never transmit information on the specific polarisations. 

Even more frustrating for the spy, whenever he fails to hit upon the correct
base, and therefore alters the polarisation of the photon, his intrusion can
be revealed, and there is nothing he can do to stay undetected. It is enough for
a sender and a receiver to verify a sufficiently long part of the key to detect any
manipulation of the polarisation of the photons by an eavesdropper.

To this end, the sender and the recipient agree on a very simple verification
protocol. Having completed the three preliminary steps specified above, and with
enough retained bits, the sender makes contact with the receiver, again by some
conventional medium, and together they check a group (let’s say 100) of bits chosen
at random from the total. If the 100 match, both the sender and the receiver can
be practically certain that no spy has snooped on the transmission and, after
discarding the bits they have just disclosed, they can consider the rest of the
sequence to be a good one-time cipher. Otherwise, the sender and the
receiver have to start the process all over again.
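
How reliable the check is can be estimated with a small simulation. The sketch below assumes an "intercept and resend" spy, a model not described in the text: he measures every photon in a randomly chosen base and forwards a photon polarised according to what he read. Photons are again modelled as (base, bit) pairs, and all names are illustrative.

```python
import random


def measure(photon, base, rng):
    """Right base: return the bit sent. Wrong base: return a random bit."""
    sent_base, bit = photon
    return bit if base == sent_base else rng.randint(0, 1)


def mismatch_rate(n_photons, spy_present, rng):
    """Fraction of retained bits on which sender and receiver disagree."""
    retained = mismatches = 0
    for _ in range(n_photons):
        sender_base, sender_bit = rng.choice("+x"), rng.randint(0, 1)
        photon = (sender_base, sender_bit)
        if spy_present:
            # The spy measures in a random base and resends what he read.
            spy_base = rng.choice("+x")
            photon = (spy_base, measure(photon, spy_base, rng))
        receiver_base = rng.choice("+x")
        receiver_bit = measure(photon, receiver_base, rng)
        if receiver_base == sender_base:  # the bit survives the comparison of bases
            retained += 1
            mismatches += receiver_bit != sender_bit
    return mismatches / retained


rng = random.Random(7)
print(mismatch_rate(20000, spy_present=False, rng=rng))  # no spy: no mismatches
print(mismatch_rate(20000, spy_present=True, rng=rng))   # with a spy: close to 0.25
# Chance that such a spy leaves 100 checked bits all intact:
print(0.75 ** 100)
```

Under this model each checked bit exposes the spy with probability 1/4, so the chance that all 100 bits match in spite of his presence is (3/4)^100, a number smaller than one in a million million.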


32cm of absolute secrecy 


Brassard’s and Bennett’s method is impeccable from a theoretical point of view, but 
when the theory was eventually put into practice, it was met with a great deal of 
scepticism. In 1989, following more than a year of hard work, Bennett fine-tuned a 
system consisting of two computers separated by a distance of 32 centimetres, one 
of which would act as the sender, and the other as the receiver. After several hours 
of trials and adjustments, the experiment was declared a success. The sender and the
receiver completed all the stages of the process, and were even able to verify their
respective ciphers. Quantum cryptography was possible.
Bennett’s historic experiment had the obvious flaw of only sending secrets over less
than the length of a pace; a whisper would probably have been just as effective.
However, in following years, other research teams increased the reach of the
transmission. In 1995, researchers at the University of Geneva used an optical fibre to
send messages 23 km. In 2006, a team from the Los Alamos National Laboratory in the
United States reached 107 km with the same procedure. Although these distances are
not yet sufficient to be useful in conventional communications, the method can be
employed on small scales in places where secrecy is paramount, such as
government buildings and company headquarters.

Leaving aside considerations relating to the physical restriction of sending
messages, there is no possibility of the transmission being sabotaged, even at the
quantum level. This quantum code represents the final triumph of secrecy over
indiscretion, of cryptography over cryptanalysis. All we have to concern ourselves
with now (not a minor issue by any means) is how to apply this powerful tool and who
will get the benefit.
Appendix


Various classic ciphers — and a hidden treasure 


Below, we will set out various classic cryptographic ciphers mentioned in the main 
chapters, but not developed in depth there. All of them are representative of a va- 
riety of cryptographic techniques, or are interesting simply as diversions. We end 
the selection of classical ciphers with a fictional decryption by the American writer
Edgar Allan Poe, which illustrates frequency analysis perfectly.


Polybius’s cipher 


This cipher, one of the oldest for which we have detailed information, is based on 
selecting five letters of the alphabet to act as the row and column headings outside 
of a five-by-five grid, and then filling the grid with the letters of the alphabet. The 
cipher consists of having each letter correspond to the pair of letters indicated by 
the rows and the columns of the table. Originally the Greek alphabet of 24 letters
was used; with the English 26-letter alphabet, I and J are usually combined so that
the letters fit (see grid below, which, for simplicity, uses A-E as the headings).
The grid is filled in an order agreed upon by the sender and the receiver. Now let’s
examine the following table:


        A     B     C     D     E
  A     A     B     C     D     E
  B     F     G     H    I/J    K
  C     L     M     N     O     P
  D     Q     R     S     T     U
  E     V     W     X     Y     Z


Note that the ciphered alphabet has to be 25 letters (5 x 5). The ciphered alphabet
can also be organised according to numeric values (for example, the numbers 1, 2,
3, 4 and 5). In that case the table could be:


        1     2     3     4     5
  1     A     B     C     D     E
  2     F     G     H    I/J    K
  3     L     M     N     O     P
  4     Q     R     S     T     U
  5     V     W     X     Y     Z


Let’s look at an example of Polybius’s cipher using the two versions. The plaintext
message is “BLANKS.” From the first table we get:


B will be substituted by the pair AB. 
L will be substituted by the pair CA. 
A will be substituted by the pair AA. 
N will be substituted by the pair CC. 
K will be substituted by the pair BE. 
S will be substituted by the pair DC. 


The ciphered message is “ABCAAACCBEDC.” If we use the numeric version, 
from an analogous process, we get: 123111332543. 
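
The worked example can be checked with a few lines of Python. This is a minimal sketch that assumes the grid is filled in plain alphabetical order with J folded into I, which is the arrangement implied by the example; the function names are illustrative.

```python
ALPHABET = "ABCDEFGHIKLMNOPQRSTUVWXYZ"  # 25 letters: J shares a cell with I


def polybius_encrypt(plaintext, headings="ABCDE"):
    """Replace each letter by its row heading followed by its column heading."""
    out = []
    for ch in plaintext.upper().replace("J", "I"):
        if ch in ALPHABET:
            row, col = divmod(ALPHABET.index(ch), 5)
            out.append(headings[row] + headings[col])
    return "".join(out)


def polybius_decrypt(ciphertext, headings="ABCDE"):
    """Read the ciphertext two symbols at a time and look up the grid cell."""
    pairs = zip(ciphertext[0::2], ciphertext[1::2])
    return "".join(ALPHABET[headings.index(r) * 5 + headings.index(c)] for r, c in pairs)


print(polybius_encrypt("BLANKS"))           # ABCAAACCBEDC
print(polybius_encrypt("BLANKS", "12345"))  # 123111332543
print(polybius_decrypt("ABCAAACCBEDC"))     # BLANKS
```

A different filling order agreed by sender and receiver would only change the `ALPHABET` string.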


Gronsfeld’s cipher 


This cipher, invented by the Dutchman Jost Maximilian Bronckhorst, the Count
of Gronsfeld, was used in Europe in the 17th century. It is a polyalphabetic cipher,
analogous to de Vigenère’s square, but less difficult (and less secure). To encrypt a
message we start with the following table:


[Table: the reference table, with one column for each letter of the alphabet and ten rows, numbered 0 to 9, that give the cipher letter for each key digit.]
Next, we select a number randomly from 0 to 9 for each letter in the message
we wish to cipher. If the plaintext is “MATHEMATICAL”, we would pick 12
numbers randomly, for example: 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2. This string of numbers
will be the key of the cipher. Next we substitute each letter of the message by the
letter corresponding to the row number in the reference table.


Message            M  A  T  H  E  M  A  T  I  C  A  L
Key                1  2  3  4  5  6  7  8  9  0  1  2
Ciphered message   P  P  A  S  R  D  T  Q  K  E  D  Q


M is ciphered as P (taken from the letter on row 1 of the M column), and so on.
The whole message is ciphered as PPASRDTQKEDQ. The letter A of the message
is ciphered as P, T and D. As is the general case of polyalphabetic ciphers, this
encryption system is resistant to brute force and frequency analysis. The number of
keys in Gronsfeld’s cipher for an alphabet of 26 letters is
$26! \times 10 = 4.03 \times 10^{27}$.
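
The mechanics can be sketched in Python. The reference table of the example is not reproduced here, so the sketch builds a HYPOTHETICAL table of ten shuffled alphabets as a stand-in; its output will therefore not match PPASRDTQKEDQ. The function names and the seed are illustrative assumptions.

```python
import math
import random
import string

ALPHABET = string.ascii_uppercase


def make_table(seed):
    """Build a hypothetical reference table: ten rows (0-9), each a shuffled alphabet."""
    rng = random.Random(seed)
    return ["".join(rng.sample(ALPHABET, 26)) for _ in range(10)]


def encrypt(plaintext, digits, table):
    """Replace each letter by the entry in its column on the row named by the key digit."""
    return "".join(table[d][ALPHABET.index(ch)] for ch, d in zip(plaintext, digits))


def decrypt(ciphertext, digits, table):
    """Find each cipher letter on the row named by the key digit and read off its column."""
    return "".join(ALPHABET[table[d].index(ch)] for ch, d in zip(ciphertext, digits))


table = make_table(seed=1)  # stand-in for the table of the example
key = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
cipher = encrypt("MATHEMATICAL", key, table)
print(cipher, decrypt(cipher, key, table))

# The key count quoted in the text, 26! x 10:
print(f"{math.factorial(26) * 10:.2e}")  # 4.03e+27
```

As in the example, a repeated plaintext letter such as A is generally ciphered differently each time, because each occurrence is read from a different row.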


The Playfair cipher 


The creators of this cipher, Baron Lyon Playfair and Sir Charles Wheatstone (also
a pioneer of the electric telegraph), were friends and neighbours, and shared a
love of cryptography. The method is reminiscent of an illustrious antecedent,
Polybius’s cipher, and also employs a table of five rows and five columns. In a first
step, the table is built from a keyword of 5 different letters, which fills the first
row; the rest of the alphabet follows in order. In our example, the keyword will be
JAMES. In the case of a 26-character alphabet, two letters (here I and K) have to
share a cell, and we generate the following cipher table:


  J     A     M     E     S
  B     C     D     F     G
  H    I/K    L     N     O
  P     Q     R     T     U
  V     W     X     Y     Z


Next, the plaintext message is divided into pairs of letters or digraphs. The two
letters of all the digraphs have to be different, and to avoid potential coincidences
we use the letter X. We also use this letter to complete a digraph in case the last
letter is alone.

For example, for the plaintext message “TRILL”, the digraph division is:
TR IL LX.
The word “TOY” is broken down as:
TO YX.


Once we have the plaintext message in digraph form, we can begin to cipher
it, taking into account three possible cases:


a) The two letters of the digraph are in the same row.
b) The two letters of the digraph are in the same column.
c) None of the above.


In the case of (a), the characters of the digraph are replaced by the letter located
to the right of each one (the “next one” in the natural order of the table). In this
way, the pair JE is ciphered as AS.


In the case of (b), the characters of the digraph are replaced by the letter that is
located immediately below in the table (from the bottom row we return to the top of
the column). For example, the digraph ET is ciphered as FY, and TY as YE.


In the case of (c), to cipher the first letter of the digraph we look along its row
until we reach the column that contains the second letter; the cipher of the plaintext
is that found at the intersection of the two. To cipher the second letter, we look
along its row until we reach the column that contains the first letter; the cipher of
the plaintext is, again, that found at the intersection.

For example, in the digraph CO, the C is ciphered as G and the O as an I or
a K.


To cipher the message “TEA” with the keyword JAMES we continue as
follows:


• We express it in digraphs: TE AX.
• The T is ciphered with a Y.
• The E as an F.
• The A as an M.
• The X as a W.


The ciphered message is “YFMW”.
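
The three cases translate directly into code. The sketch below uses the table implied by the worked examples (keyword JAMES in the first row, I and K sharing a cell); that layout, the folding of K into I and the function names are assumptions made for illustration.

```python
TABLE = [
    "JAMES",
    "BCDFG",
    "HILNO",  # I and K share a cell; K is folded into I below
    "PQRTU",
    "VWXYZ",
]
POSITION = {ch: (r, c) for r, row in enumerate(TABLE) for c, ch in enumerate(row)}


def to_digraphs(plaintext):
    """Split into pairs, adding X after a repeated letter or a lone last letter."""
    letters = [ch for ch in plaintext.upper().replace("K", "I") if ch in POSITION]
    pairs, i = [], 0
    while i < len(letters):
        first = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != first:
            pairs.append(first + letters[i + 1])
            i += 2
        else:
            pairs.append(first + "X")
            i += 1
    return pairs


def encrypt_pair(pair):
    (r1, c1), (r2, c2) = POSITION[pair[0]], POSITION[pair[1]]
    if r1 == r2:  # case (a): same row, take the letter to the right
        return TABLE[r1][(c1 + 1) % 5] + TABLE[r2][(c2 + 1) % 5]
    if c1 == c2:  # case (b): same column, take the letter below
        return TABLE[(r1 + 1) % 5][c1] + TABLE[(r2 + 1) % 5][c2]
    # case (c): stay on the letter's own row, move to the other letter's column
    return TABLE[r1][c2] + TABLE[r2][c1]


def playfair_encrypt(plaintext):
    return "".join(encrypt_pair(p) for p in to_digraphs(plaintext))


print(to_digraphs("TRILL"))     # ['TR', 'IL', 'LX']
print(playfair_encrypt("TEA"))  # YFMW
```

The shared cell is printed as I, so the O of the digraph CO comes out as I here.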


The cryptogram of The Gold-Bug 
William Legrand, the protagonist of The Gold-Bug (1843), by Edgar Allan Poe,
discovers where a fabulous treasure is buried after deciphering a cryptogram
written on a piece of parchment. The procedure followed by Legrand is a statistical
method based on the frequency of the appearance of the letters that comprise an
English text. The ciphered message is as follows:


53‡‡†305))6*;4826)4‡.)4‡);806*;48†8
¶60))85;1‡(;:‡*8†83(88)5*†;46(;88*96
*?;8)*‡(;485);5*†2:*‡(;4956*2(5*-4)8
¶8*;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡
1;48†85;4)485†528806*81(‡9;48;(88;4
(‡?34;48)4‡;161;:188;‡?;


Legrand starts with the assumption that the original text was written in English.
The letter that occurs most frequently in English is e. Next, and in order of
appearance from most to least frequent, we have the letters: a, o, i, d, h, n, r, s,
t, u, y, c, f, g, l, m, w, b, k, p, q, x, z.

Our hero creates a table from the cryptogram. In the first row, the characters of
the ciphered message, and in the second, the frequency of their appearances.


[Table: each character of the cryptogram with the number of times it appears; the most frequent is 8, with 33 appearances, and the next counts are 26, 19, 16 and 16.]


Therefore, 8 is very probably the letter e. Next he looks for appearances of the
trio of characters the, also very common, which allows him to translate the
characters ;, 4, and 8.

The appearance of the term “;(88”, now that he knows that it represents “t(ee”,
lets him deduce that the missing term ( can only be an r given that tree is the
best possibility in the dictionary. Finally, thanks to similar ingenious cryptanalytic
techniques and with a great deal of patience, he arrives at the following ciphered
partial alphabet:


[Table: the partial cipher alphabet found by Legrand.]


That is enough to decipher the message: 


“A good glass in the bishop’s hostel in the devil’s seat 
forty-one degrees and thirteen minutes northeast and by north 
main branch seventh limb east side shoot from the left eye of the death’s-head
a bee line from the tree through the shot fifty feet out.”
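
Legrand's first steps are easy to repeat by machine. The sketch below counts the characters of the cryptogram and applies the first three substitutions. The cryptogram string is transcribed from the standard text of Poe's story, with "-" standing for its dash symbol; treat that transcription, and the names used, as assumptions.

```python
from collections import Counter

# Cryptogram as in the standard text of "The Gold-Bug" ("-" stands for the dash).
CRYPTOGRAM = (
    "53‡‡†305))6*;4826)4‡.)4‡);806*;48†8"
    "¶60))85;1‡(;:‡*8†83(88)5*†;46(;88*96"
    "*?;8)*‡(;485);5*†2:*‡(;4956*2(5*-4)8"
    "¶8*;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡"
    "1;48†85;4)485†528806*81(‡9;48;(88;4"
    "(‡?34;48)4‡;161;:188;‡?;"
)

# Step 1: frequency of each character, most frequent first.
frequencies = Counter(CRYPTOGRAM)
print(frequencies.most_common(3))  # [('8', 33), (';', 26), ('4', 19)]

# Step 2: the trio ";48" is frequent, which suggests the word "the".
print(CRYPTOGRAM.count(";48"))     # 7

# Step 3: apply what is known so far and mark the rest as unknown.
KNOWN = {"8": "e", ";": "t", "4": "h"}
partial = "".join(KNOWN.get(ch, "_") for ch in CRYPTOGRAM)
print(partial)
```

The partial text contains the fragment t_ee, the clue that leads Legrand to the letter r.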
Prime numbers and their value in cryptography


Real mathematics has no effects on war. No one has yet discovered 
any warlike purpose to be served by the theory of numbers. 
Godfrey H. Hardy, A Mathematician’s Apology (1940) 


In order to decrypt a message, it is essential that the cipher have an inverse. As we
have already observed in the study of affine codes, a way to guarantee this property
is to work with a prime number modulus. Moreover, the product of prime numbers
constitutes a function that is, in practice, irreversible; that is to say, once the
multiplication has been performed, it is a very laborious task to ascertain the value
of the original factors.

This property makes this operation a very useful tool for systems based on
asymmetrical keys, like the RSA algorithm, that in turn constitute the basis for public
key cryptography. Here is a more detailed look at the overlap between prime
numbers and cryptography, which we will demonstrate through the formal
mathematical operation of RSA.


Prime numbers and the “other” Fermat theorem 


The prime numbers are a subset of the natural numbers that includes all
the elements of the bigger set that are larger than 1 and only divisible by
themselves and by one. The fundamental theorem of arithmetic establishes that any
natural number larger than one can always be represented as a product of powers of
prime numbers, and this representation (factorisation) is unique. For example:


$20 = 2^2 \cdot 5$
$63 = 3^2 \cdot 7$
$1{,}050 = 2 \cdot 3 \cdot 5^2 \cdot 7$
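
For small numbers the factorisation can be found by trial division, as in the following Python sketch (the function name is illustrative). The method is only practical for small inputs, which is precisely the point made earlier about the difficulty of recovering the factors of a large product.

```python
def factorise(n):
    """Return the prime factorisation of n > 1 as {prime: exponent}, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:  # whatever is left is itself a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


print(factorise(20))    # {2: 2, 5: 1}
print(factorise(63))    # {3: 2, 7: 1}
print(factorise(1050))  # {2: 1, 3: 1, 5: 2, 7: 1}
```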


All prime numbers except for 2 are odd. The only two consecutive prime
numbers are 2 and 3. Odd prime numbers that are just 2 apart
(for example, 17 and 19) are called twin prime numbers. Mersenne and Fermat
primes are also of particular interest.

A prime number is a Mersenne prime if, when 1 is added to it, the result is a power
of 2. For example, 7 is a Mersenne prime number since $7 + 1 = 8 = 2^3$.

The first eight Mersenne prime numbers are therefore:


3; 7; 31; 127; 8,191; 131,071; 524,287; 2,147,483,647
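
The list can be reproduced with a brute-force search. The sketch below tests each number of the form 2^k - 1 by trial division, which is enough for these eight values but hopeless for the gigantic Mersenne primes known today; the function names are illustrative.

```python
def is_prime(n):
    """Trial-division primality test, adequate for numbers of this size."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def mersenne_primes(count):
    """Return the first `count` primes of the form 2**k - 1."""
    found, exponent = [], 2
    while len(found) < count:
        candidate = 2 ** exponent - 1
        if is_prime(candidate):
            found.append(candidate)
        exponent += 1
    return found


print(mersenne_primes(8))
```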


Today we know of only 40 or so Mersenne prime numbers. The largest is a
gigantic number: $2^{43,112,609} - 1$, discovered in 2008. By way of comparison, the
estimated number of elementary particles in the entire universe is less than $2^{300}$.
A Fermat prime, for its part, is a prime number of the form:


$F_n = 2^{2^n} + 1$, with $n$ being a natural number.


We only know five Fermat primes: 3 ($n = 0$), 5 ($n = 1$), 17 ($n = 2$), 257 ($n = 3$) and
65,537 ($n = 4$).


Fermat’s primes carry the name of the illustrious French jurist and mathematician
who discovered them, Pierre de Fermat (1601-1665). The Frenchman made
numerous and important additional discoveries relative to prime numbers. One that
stands out is Fermat’s little theorem, which establishes that:

If $p$ is a prime number, then for any integer $a$, $a^p \equiv a \pmod{p}$.


That result is of great importance in modern cryptography, as we shall now see.
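
The theorem is easy to check numerically. This small Python sketch uses the built-in three-argument `pow`, which computes a power modulo a number without building the huge intermediate value; the function name is illustrative.

```python
def fermat_holds(p, a):
    """Check whether a**p and a leave the same remainder when divided by p."""
    return pow(a, p, p) == a % p


# For a prime modulus the relation holds for every integer a.
for p in (2, 3, 5, 7, 11, 13, 65537):
    assert all(fermat_holds(p, a) for a in range(-50, 51))

# For a composite modulus it can fail: 2**15 leaves remainder 8, not 2, modulo 15.
print(fermat_holds(15, 2))  # False
```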


From Euler to RSA 


Another result of great interest in modular arithmetic is that known as Bézout’s
identity. The identity establishes that if $a$ and $b$ are positive integers and
$\gcd(a, b) = k$, then there are two whole numbers $p$ and $q$ that satisfy:


$$pa + qb = k.$$


In the particular case that $\gcd(a, b) = 1$, we can claim that there are whole
numbers $p$ and $q$ such that


$$pa + qb = 1.$$
If we work in modulus $n$, we can establish that if $\gcd(a, n) = 1$ then there are,
necessarily, whole numbers $p$ and $q$ such that $pa + qn = 1$. From the supposition of
modulus $n$ we hold that $qn \equiv 0$, from which we conclude that there is a $p$ such
that $pa \equiv 1$, that is, the inverse of $a$ in modulus $n$ exists and is $p$.
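
The numbers $p$ and $q$ can be computed with the extended Euclidean algorithm, a standard method that the text does not describe; the sketch below is therefore an illustration, with names chosen for convenience.

```python
def extended_gcd(a, b):
    """Return (g, p, q) with g = gcd(a, b) and p*a + q*b == g."""
    old_r, r = a, b
    old_p, p = 1, 0
    old_q, q = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_p, p = p, old_p - quotient * p
        old_q, q = q, old_q - quotient * q
    return old_r, old_p, old_q


def inverse_mod(a, n):
    """Return the inverse of a in modulus n, which exists only if gcd(a, n) == 1."""
    g, p, _ = extended_gcd(a, n)
    if g != 1:
        raise ValueError("a has no inverse in modulus n because gcd(a, n) != 1")
    return p % n


g, p, q = extended_gcd(240, 46)
print(g, p * 240 + q * 46)  # 2 2
print(inverse_mod(7, 26))   # 15, since 7 * 15 = 105 = 4 * 26 + 1
```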

The number of elements with an inverse in modulus $n$ will be, then, the number
of natural numbers $a$ smaller than $n$ that fulfil $\gcd(a, n) = 1$. This number
is given by Euler’s function (also known as Euler’s totient function) and is denoted
as $\varphi(n)$.


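If the factorisation of $n$ in prime numbers is $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$, then:


$$\varphi(n) = n \left(1 - \frac{1}{p_1}\right) \left(1 - \frac{1}{p_2}\right) \cdots \left(1 - \frac{1}{p_k}\right).$$


If, for example, $n = 1{,}600 = 2^6 \cdot 5^2$ we will have:


$$\varphi(1{,}600) = 1{,}600 \left(1 - \frac{1}{2}\right) \left(1 - \frac{1}{5}\right) = 640.$$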
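
Both ways of obtaining this value, counting the numbers that have an inverse and applying the product over the prime factors, can be compared in Python. The sketch assumes $n > 1$, and the function names are illustrative.

```python
from math import gcd


def phi_by_counting(n):
    """Count the natural numbers a < n with gcd(a, n) == 1 (for n > 1)."""
    return sum(1 for a in range(1, n) if gcd(a, n) == 1)


def phi_by_formula(n):
    """Multiply n by (1 - 1/p) for each prime factor p, using integer arithmetic."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:  # one prime factor larger than the square root may remain
        result -= result // m
    return result


print(phi_by_counting(1600), phi_by_formula(1600))  # 640 640
print(phi_by_formula(13))                           # 12, that is, n - 1 for a prime
```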


Fine-tuning even more, if $n$ is prime, then for any value of $a$ smaller than $n$ we
get $\gcd(a, n) = 1$ and, consequently, any such value of $a$ will have an inverse in
modulus $n$, and, therefore, $\varphi(n) = n - 1$.

Let us take a moment and recall the most important conclusions we have reached
so far:

1) $\varphi(n)$ is called Euler’s function and indicates the quantity of numbers smaller
than $n$ that are coprime with $n$.

2) If $n = p \cdot q$ with $p$ and $q$ being two different prime numbers, then
$\varphi(n) = (p - 1)(q - 1)$.

3) From Fermat’s little theorem, we know that if $a$ is a whole number greater
than 0 and $p$ is a prime number we will have the relation $a^p \equiv a \pmod{p}$, which,
when $a$ is not a multiple of $p$, is the same as affirming that $a^{p-1} \equiv 1 \pmod{p}$.

All that’s left is to add the final piece, which is provided by Euler’s theorem.
Euler affirms that:

4) If $\gcd(a, n) = 1$, then we verify the equation $a^{\varphi(n)} \equiv 1 \pmod{n}$.
Why does the RSA algorithm work?


Armed with the knowledge expressed above, we are ready to show the mathemati- 
cal arguments that underlie the ciphering process of the RSA algorithm. 

The algorithm in question encrypts a numerical representation $m$ of any message
whatsoever, with $p$ and $q$, two prime numbers, and $n = p \cdot q$. We call $e$ any value
that verifies that $\gcd(e, \varphi(n)) = 1$ and we call $d$ the inverse of $e$ in modulus
$\varphi(n)$ [that we know exists since $\gcd(e, \varphi(n)) = 1$]. So:

$$d \cdot e \equiv 1 \pmod{\varphi(n)}.$$


The ciphered message, $M$, is obtained according to $M \equiv m^e \pmod{n}$. The
algorithm presupposes that the original message $m$ is obtained by
$m \equiv M^d = (m^e)^d \pmod{n}$. Verifying this equation is equivalent to demonstrating
the validity of RSA. To do this, we combine Fermat’s theorem with Euler’s theorem.

Let’s consider two cases:

1) If $\gcd(m, n) = 1$, according to Euler’s theorem $m^{\varphi(n)} \equiv 1 \pmod{n}$.

We start from the relation $d \cdot e \equiv 1 \pmod{\varphi(n)}$, which is equivalent to
$e \cdot d - 1 \equiv 0 \pmod{\varphi(n)}$, that is, there is a value $k$, whole, such that
$e \cdot d - 1 = k \cdot \varphi(n)$, that is, $e \cdot d = k \cdot \varphi(n) + 1$. With
this, applying Euler’s theorem, we have the equation:


$$(m^e)^d = m^{ed} = m^{k\varphi(n)+1} = m^{k\varphi(n)} \cdot m = (m^{\varphi(n)})^k \cdot m \equiv 1^k \cdot m = m \pmod{n}.$$


This is the result we were seeking. 


2) If $\gcd(m, n) \neq 1$, and $n = p \cdot q$, $m$ will contain as factor only $p$,
only $q$, or both simultaneously.
a) Suppose that $m$ is a multiple of $p$, that is, there is a whole number $r$ such that
$m = r \cdot p$. Therefore $m^{ed} \equiv 0 \pmod{p}$ and, since $m \equiv 0 \pmod{p}$ as
well, finally $m^{ed} \equiv m \pmod{p}$; in other words, there is a value of $A$ so that:

$$m^{ed} - m = A p. \quad (1)$$


b) In the second place, since $\varphi(n) = (p - 1)(q - 1)$, we have that


$$(m^e)^d = m^{ed} = m^{k\varphi(n)+1} = m^{k\varphi(n)} \cdot m = (m^{\varphi(n)})^k \cdot m = (m^{q-1})^{k(p-1)} \cdot m.$$


Since $\gcd(m, n) = p$, then $\gcd(m, q) = 1$ and, by Fermat’s theorem,
$m^{q-1} \equiv 1 \pmod{q}$.


Applied to the initial equation:


$$(m^e)^d = m^{ed} = m^{k\varphi(n)+1} = m^{k\varphi(n)} \cdot m = (m^{q-1})^{k(p-1)} \cdot m \equiv 1^{k(p-1)} \cdot m = m \pmod{q},$$

from which we conclude that there is a value of $B$ such that


$$m^{ed} - m = B q. \quad (2)$$


From expressions (1) and (2) we can affirm that both $p$ and $q$ divide $m^{ed} - m$;
since they are different primes, $p \cdot q = n$ divides $m^{ed} - m$, and therefore
$m^{ed} - m \equiv 0 \pmod{n}$.


The process is analogous if we consider $q$. In the case where $m$ is the multiple of
both $p$ and $q$ simultaneously, the result is trivial. Consequently,
$$(m^e)^d \equiv m \pmod{n}.$$
The cipher of the RSA algorithm is thus demonstrated.
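
The result can be verified on a toy example. The primes 61 and 53 and the exponent 17 below are illustrative choices, far too small to offer any security, and the function names are assumptions. The loop checks every possible message, including the multiples of $p$ and $q$ covered by the second case of the proof.

```python
from math import gcd

p, q = 61, 53            # toy primes, for illustration only
n = p * q                # 3233
phi = (p - 1) * (q - 1)  # 3120
e = 17                   # any value with gcd(e, phi) == 1 will do
assert gcd(e, phi) == 1
d = pow(e, -1, phi)      # inverse of e in modulus phi(n); needs Python 3.8 or later


def encrypt(m):
    """M = m**e (mod n)."""
    return pow(m, e, n)


def decrypt(M):
    """m = M**d (mod n)."""
    return pow(M, d, n)


print(d)                     # 2753
print(decrypt(encrypt(65)))  # 65

# Every message 0 <= m < n is recovered, whether or not gcd(m, n) == 1.
assert all(decrypt(encrypt(m)) == m for m in range(n))
```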
Coding and encryption


The safety and confidentiality of communication in the digital
world depend on a complex code designed by mathematics.
This book offers a stimulating journey through the arithmetic
of security and secrecy, introduces you to the encryptors and
decryptors who determined the destiny of nations and uncovers
the language through which computers communicate.


